Foreword



ETAPS 2000 was the third instance of the European Joint Conferences on Theory 
and Practice of Software. ETAPS is an annual federated conference that was 
established in 1998 by combining a number of existing and new conferences. 
This year it comprised five conferences (FOSSACS, FASE, ESOP, CC, TACAS),
five satellite workshops (CBS, CMCS, CoFI, GRATRA, INT), seven invited
lectures, a panel discussion, and ten tutorials.

The events that comprise ETAPS address various aspects of the system
development process, including specification, design, implementation, analysis,
and improvement. The languages, methodologies, and tools which support these
activities are all well within its scope. Different blends of theory and practice
are represented, with an inclination towards theory with a practical motivation
on one hand and soundly-based practice on the other. Many of the issues involved
in software design apply to systems in general, including hardware systems, and
the emphasis on software is not intended to be exclusive.

ETAPS is a loose confederation in which each event retains its own identity,
with a separate program committee and independent proceedings. Its format is
open-ended, allowing it to grow and evolve as time goes by. Contributed talks
and system demonstrations are in synchronized parallel sessions, with invited
lectures in plenary sessions. Two of the invited lectures are reserved for
“unifying” talks on topics of interest to the whole range of ETAPS attendees.
The aim of cramming all this activity into a single one-week meeting is to
create a strong magnet for academic and industrial researchers working on topics
within its scope, giving them the opportunity to learn about research in related
areas, and thereby to foster new and existing links between work in areas that
were formerly addressed in separate meetings. The program of ETAPS 2000 included
a public business meeting where participants had the opportunity to learn about
the present and future organization of ETAPS and to express their opinions
about what is bad, what is good, and what might be improved.

ETAPS 2000 was hosted by the Technical University of Berlin and was
efficiently organized by the following team:

Bernd Mahr (General Chair)
Hartmut Ehrig (Program Coordination)
Peter Pepper (Organization)
Stefan Jähnichen (Finances)
Radu Popescu-Zeletin (Industrial Relations)

with the assistance of BWO Marketing Service GmbH. The publicity was
superbly handled by Doris Fähndrich of the TU Berlin with assistance from the
ETAPS publicity chair, Andreas Podelski. Overall planning for ETAPS
conferences is the responsibility of the ETAPS steering committee, whose current
membership is:



Egidio Astesiano (Genova), Jan Bergstra (Amsterdam), Pierpaolo Degano
(Pisa), Hartmut Ehrig (Berlin), José Fiadeiro (Lisbon), Marie-Claude
Gaudel (Paris), Susanne Graf (Grenoble), Furio Honsell (Udine), Heinrich
Hußmann (Dresden), Stefan Jähnichen (Berlin), Paul Klint (Amsterdam),
Tom Maibaum (London), Tiziana Margaria (Dortmund), Ugo Montanari
(Pisa), Hanne Riis Nielson (Aarhus), Fernando Orejas (Barcelona),
Andreas Podelski (Saarbrücken), David Sands (Göteborg), Don Sannella
(Edinburgh), Gert Smolka (Saarbrücken), Bernhard Steffen (Dortmund),
Wolfgang Thomas (Aachen), Jerzy Tiuryn (Warsaw), David Watt (Glasgow),
Reinhard Wilhelm (Saarbrücken)

ETAPS 2000 received generous sponsorship from: 

the Institute for Communication and Software Technology of TU Berlin
the European Association for Programming Languages and Systems
the European Association for Theoretical Computer Science
the European Association for Software Science and Technology
the “High-Level Scientific Conferences” component of the European
Commission’s Fifth Framework Programme

I would like to express my sincere gratitude to all of these people and
organizations, the program committee members of the ETAPS conferences, the
organizers of the satellite events, the speakers themselves, and finally
Springer-Verlag for agreeing to publish the ETAPS proceedings.

Donald Sannella
ETAPS Steering Committee chairman




Preface 



This volume contains the proceedings of the international conference
Foundations of Software Science and Computation Structures (FOSSACS 2000), held
in Berlin, March 27-31, 2000. FOSSACS is a constituent event of the Joint
European Conferences on Theory and Practice of Software (ETAPS). This was the
third meeting of ETAPS. The previous two meetings took place in Lisbon (1998)
and Amsterdam (1999).

FOSSACS seeks papers which offer progress in foundational research with
a clear significance for software science. A central issue is theories and
methods which support the specification, transformation, verification, and
analysis of programs and software systems. The articles contained in the
proceedings represent various aspects of the scope of the conference described
above. In addition to the invited lectures of ETAPS 2000, FOSSACS 2000 had one
invited lecture by Abbas Edalat (Imperial College, London) “A Data Type for
Computational Geometry and Solid Modelling”.

These proceedings contain 25 contributed papers, selected out of a total of 68 
submissions. This has been the largest number of submissions to FOSSACS to 
date. The selection procedure was done through a virtual meeting of the program 
committee. Each paper was thoroughly evaluated by the members of the program 
committee and their subreferees. I would like to sincerely thank all of them for 
the excellent job they did during the very difficult process of selecting the papers. 
Special thanks go to Robert Maron for his help in organizing the WWW page for 
the selection process and for his continuous efforts in maintaining and processing 
large data files. 
Abstract. We consider action-labelled systems with non-deterministic
and probabilistic choice. Using the concept of norm functions [?], we
introduce two types of bisimulations (called (strict) normed bisimulation
equivalence) that allow for delays when simulating a transition and are
strictly between strong and weak bisimulation equivalence à la [26,36,37].
Using a suitable modification of the prominent splitter/partitioning
technique [25,30], we present polynomial-time algorithms that construct the
quotient space of the (strict) normed bisimulation equivalence classes.



1 Introduction 

Probabilistic aspects play a crucial role for a quantitative analysis of various
types of parallel systems, such as systems that are designed on the basis of
randomized algorithms or computer systems with unreliable components. In the
former case, probabilities can be used to specify the frequencies of the possible
outcomes of an explicit probabilistic choice (“tossing a fair coin”); in the latter
case, probabilities might express failure rates. Besides the probabilistic choices,
the (transition) systems we consider allow for non-deterministic choices. These
can be used for modelling probabilistic systems with asynchronous parallelism
[41,19,18,34,5] where the non-determinism is used to describe the interleaving
of the subprocesses. Moreover, as observed by several authors [21,23,34], the
non-determinism can also be used to represent underspecification or incomplete
information about the environment. Due to the combination of non-determinism
and probability, the design and analysis of such systems (with both types of
choices) can be hard.

As for any kind of computer system, the use of implementation relations
(which compare two systems, thus yielding a formal definition of when a
program $\mathcal{P}$ correctly implements another one $\mathcal{P}'$) has turned
out to be useful for the design and the system analysis. In this paper, we
restrict ourselves to equivalences that yield a notion of process equality. There
are several highly desirable conditions that any reasonable process equivalence
$\approx$ should fulfill, including e.g. the soundness for establishing
quantitative linear time properties and congruence properties w.r.t. certain
composition operators of a process calculus (such as parallel composition). A
further crucial aspect is the development of methods that support the proof of
the equivalence of two processes (i.e. deductive or algorithmic techniques to
show $\mathcal{P} \approx \mathcal{P}'$). The algorithmic methods are of great
importance for automatic verification tools that take as their input a system
$\mathcal{P}$ and its specification $\mathcal{P}'$ and decide whether
$\mathcal{P}$ correctly implements $\mathcal{P}'$. Moreover, algorithms for
computing the quotient space yield an abstraction technique which is highly
relevant for the system analysis. For this, one replaces the states by their
equivalence classes and then establishes the desired properties for the
quotient space $S/\approx$ rather than the original state space $S$. Especially
when we deal with weak equivalences (that abstract from internal computations)
the switch from the original system $S$ to the quotient space $S/\approx$ might
lead to a much smaller equivalent system, and hence can be viewed as a technique
to combat the state explosion problem.

J. Tiuryn (Ed.): FOSSACS 2000, LNCS 1784, pp. 1-17, 2000.
© Springer-Verlag Berlin Heidelberg 2000

Several (strong and weak) equivalences for various types of probabilistic
systems have been proposed in the literature. They range over the full linear and
branching time spectrum and are extensions of the corresponding relations on
LTSs. While in the fully probabilistic setting the equivalences are studied under
several aspects (compositionality, axiomatization, decidability, logical
characterizations, etc.), see e.g. [?], the treatment of equivalences for
probabilistic systems with non-determinism is less well understood. Most of the
standard relations that have proven to be useful in the non-probabilistic setting
have been extended for the probabilistic case; see e.g. [?] for a trace-based
relation, [42,28] for testing equivalences and [?] for several
types of (bi-)simulations. However, due to the combination of non-determinism
and probability, the definitions are more complicated than the corresponding
notions for non-probabilistic or fully probabilistic systems. Even though some
important issues (like compositionality and axiomatization) have been addressed
in the above mentioned literature, research on algorithmic methods to decide
the equivalence of two systems or to compute the quotient space is rare. For
strong bisimulation [26] and strong simulation [36], polynomial-time algorithms
have been presented in [3]. To the best of our knowledge, the forthcoming work
[32] is the first attempt to formulate an algorithmic method that deals with a
weak equivalence for probabilistic processes with non-determinism. We are not
aware of any complexity (or even decidability) result for weak bisimulation à la
[?] or any linear time relation on probabilistic systems with non-determinism,
e.g. trace distribution equivalence [35].¹

¹ As (non-probabilistic) LTSs are special instances of probabilistic systems with
non-determinism and the trace distribution preorder à la Segala is a conservative
extension of usual trace containment, the PSPACE-completeness for LTSs [25]
yields the PSPACE-hardness for the trace distribution relation à la [35].

Our contribution: We deal with probabilistic systems with non-determinism
and action labels modelled by a probabilistic extension of LTSs where the
(action-labelled) transitions are augmented with probabilities for the possible
target states. Our model essentially agrees with the simple probabilistic
automata of [?]. Our main contribution is the presentation of novel notions of
bisimulation equivalence which (in some sense) are insensitive with respect to
internal transitions. More precisely, our equivalences are conservative
extensions of delay bisimulation equivalence [40,14] which relies on the
assumption that the simulation of a step of a process $\mathcal{P}$ by another
process $\mathcal{P}'$ might happen with a certain delay (i.e. after a sequence of
internal transitions). The formal definition of our equivalences is provided by a
probabilistic variant of norm functions in the style of [?]. Intuitively, the
norm functions specify bounds for the delays (i.e. the number of internal
transitions that might be performed before a “proper” transition of a process
$\mathcal{P}$ is simulated by a corresponding transition of an equivalent process
$\mathcal{P}'$). In the probabilistic setting where the combination of internal
transitions leads to a tree rather than a linear chain, the norm functions yield
conditions on the length of the paths in the trees corresponding to a “delayed
transition”. Using a modification of the traditional splitter/partitioning
technique [25,30], we present polynomial-time algorithms for computing the
quotient spaces. Moreover, we briefly discuss some other aspects
(compositionality w.r.t. parallel composition and preservation of linear time
properties).

Organization of the paper: Section 2 introduces our model for probabilistic
labelled transition systems. The definitions of norm functions and normed
bisimulations are presented in Section 3. In Section 4 we present our algorithm
for computing the bisimulation equivalence classes. Section 5 concludes the paper.

Because of space restrictions, we present our main results without proofs. We 
refer the interested reader to [S] where the proofs and other details (including 
results about various types of bisimulations and simulations) can be found. 



2 Probabilistic Labelled Transition Systems 

In (ordinary) LTSs, the transitions $s \xrightarrow{a} t$ specify the possibility
that the system in state $s$ moves via the action $a$ to state $t$. In this paper,
we deal with a probabilistic variant of LTSs where any transition is augmented
with a probabilistic choice for the possible target states (rather than a unique
target state $t$ as is the case in LTSs). That is, in the probabilistic setting,
the transitions are of the form $s \xrightarrow{a} \mu$ where $s$ is the starting
state, $a$ an action label and $\mu$ a distribution on the state space which
specifies the probabilities $\mu(t)$ for any possible successor state $t$.
Non-determinism is present in our model since we allow several (possibly
equally action-labelled) outgoing transitions of a state $s$.

Notation 1 Let $S$ be a finite set. A distribution on $S$ is a function
$\mu : S \to [0, 1]$ such that $\sum_{s \in S} \mu(s) = 1$. Let
$Supp(\mu) = \{s \in S : \mu(s) > 0\}$ denote the support of $\mu$,
$\mu[A] = \sum_{s \in A} \mu(s)$ for $\emptyset \neq A \subseteq S$ and
$\mu[\emptyset] = 0$. For $s \in S$, $\mu^1_s$ denotes the unique distribution on
$S$ with $\mu^1_s(s) = 1$. $Distr(S)$ denotes the collection of all distributions
on $S$. If $R$ is an equivalence relation on $S$ then $S/R$ denotes the quotient
space of $S$ with respect to $R$. The induced equivalence $\equiv_R$ on $Distr(S)$
is given by $\mu \equiv_R \mu'$ iff $\mu[A] = \mu'[A]$ for all $A \in S/R$. We
write $[\mu]_R$ for the equivalence class $\{\mu' : \mu' \equiv_R \mu\}$ of $\mu$
with respect to $\equiv_R$.
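The notions of Notation 1 can be made concrete with a small sketch. It assumes that a distribution is stored as a Python dict from states to exact `Fraction` probabilities and that an equivalence relation is given by its list of classes; the function names `support`, `mass`, `dirac` and `equivalent` are illustrative choices, not names used in the paper.

```python
from fractions import Fraction

def support(mu):
    """Supp(mu): the states that receive positive probability."""
    return {s for s, p in mu.items() if p > 0}

def mass(mu, block):
    """mu[A]: total probability of the state set A (0 for the empty set)."""
    return sum((mu.get(s, Fraction(0)) for s in block), Fraction(0))

def dirac(s):
    """The unique distribution that assigns probability 1 to state s."""
    return {s: Fraction(1)}

def equivalent(mu, nu, classes):
    """Induced equivalence: mu and nu agree on every equivalence class."""
    return all(mass(mu, block) == mass(nu, block) for block in classes)

# Toy data (hypothetical): R identifies s and t, and keeps u separate.
classes = [frozenset({"s", "t"}), frozenset({"u"})]
mu = {"s": Fraction(1, 2), "u": Fraction(1, 2)}
nu = {"t": Fraction(1, 2), "u": Fraction(1, 2)}
```

Here `mu` and `nu` differ as functions but assign the same mass to both classes, so they are identified by the induced equivalence, whereas `mu` and `dirac("u")` are not.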
Definition 1. A probabilistic labelled transition system (PLTS for short) is a
tuple $(S, Act, \to)$ where $S$ is a finite set of states, $Act$ a finite set of
actions (containing a special symbol $\tau$)² and
$\to \subseteq S \times Act \times Distr(S)$ a transition relation such that for
all $s \in S$ and $a \in Act$, $Steps_a(s) = \{\mu : s \xrightarrow{a} \mu\}$ is
finite. A probabilistic program is a tuple
$\mathcal{P} = (S, Act, \to, s_{init})$ consisting of a PLTS $(S, Act, \to)$ and an
initial state $s_{init} \in S$.

Example 1. We consider a simple communication protocol consisting of a sender
(that produces certain messages and tries to submit the messages along an
unreliable medium) and a receiver (that acknowledges the receipt and consumes the
received messages). For simplicity, we assume that both the sender and the
receiver work with mailing boxes that cannot hold more than one message at any
time. The failure rate of the medium is 1%; i.e., with probability 1/100 the
medium loses the message and the sender retries to submit the message. In state
$s_{init}$, the sender produces a message and passes the message to the medium
which leads to the state $s_{del}$ (where the medium tries to deliver the message
via an internal action). When the message is delivered correctly, the state
$s_{ok}$ is reached. In state $s_{ok}$, the sender and the receiver can work in
parallel (modelled by interleaving): the sender may produce the next message
while the receiver may consume the last message.

The executions of a PLTS are given by the paths in the underlying directed
graph. They arise through the resolution of both the non-deterministic and
probabilistic choices.⁴ Typically, one assumes that the resolution of the
non-deterministic choices is not under the control of the system itself. The
entity that resolves the non-determinism (the “environment”) can be formalized by
a scheduler [?] (also called adversary or policy in the theory of MDPs [33]).
Given a scheduler $A$, the system behaviour under $A$ can be described by a
Markov chain which yields a Borel field and probability measure on the paths
that can be obtained by $A$. The details are not of importance for this paper and
are omitted here. They can be found e.g. in the above mentioned references.

² We refer to $\tau$ as the internal action. All other actions are called visible.

³ Any finite LTS $(S, Act, \to)$ (where $\to \subseteq S \times Act \times S$) can
be viewed as a PLTS. For this, we identify any transition $s \xrightarrow{a} t$
with its probabilistic counterpart $s \xrightarrow{a} \mu^1_t$.

⁴ Formally, a path is a sequence
$s_0 \xrightarrow{a_1,\mu_1} s_1 \xrightarrow{a_2,\mu_2} s_2 \xrightarrow{a_3,\mu_3} \cdots$
where $s_{i-1} \xrightarrow{a_i} \mu_i$, $s_i \in Supp(\mu_i)$.

3 Normed Bisimulation

In ordinary LTSs, the several types of bisimulations (e.g. strong, weak, branching
or delay bisimulation [28,31,29,16,40,14]) establish a correspondence between the
states and their stepwise behaviour. Intuitively, they identify those states $s$
and $s'$ where any outgoing transition from $s$ can be simulated by $s'$ and vice
versa. Most types of bisimulation equivalences on an LTS $(S, Act, \to)$ can be
characterized as the coarsest equivalence $R$ on the state space $S$ such that

(Bis) If $(s,s') \in R$, $C \in S/R$ and $s \xrightarrow{a} C$ then $s' \in Pre^{*}(a,C)$.

Here, we write $s \xrightarrow{a} C$ if $s \xrightarrow{a} t$ for some $t \in C$.
$Pre^{*}(a, C)$ denotes a certain predecessor predicate. Intuitively,
$s' \in Pre^{*}(a,C)$ asserts that $s'$ can “simulate” the transition
$s \xrightarrow{a} C$. The formal definition of $Pre^{*}(a,C)$ depends on the
concrete type of equivalence. E.g., strong bisimulation is obtained by using the
predicate $Pre^{str}(a,C) = \{s' : s' \xrightarrow{a} C\}$ while delay bisimulation
equivalence [40,14] focuses on the idea that the simulation of a transition
$s \xrightarrow{a} t$ might happen with a certain delay (i.e. after a finite number
of internal moves) and uses the predicates $Pre^{del}(\cdot)$ which are given by
the following three conditions:⁵

(D0) $C \subseteq Pre^{del}(\tau, C)$
(D1) If $s \xrightarrow{a} C$ then $s \in Pre^{del}(a,C)$.
(D2) If $s \xrightarrow{\tau} t$ and $t \in Pre^{del}(a,C)$ then $s \in Pre^{del}(a,C)$.
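Conditions (D0) to (D2) read naturally as closure rules, so the delay predecessor set of a finite LTS can be computed as a least fixed point. The following is a minimal sketch under assumptions that are not in the paper: transitions are `(source, action, target)` triples, the internal action is spelled `"tau"`, and the function name `pre_delay` is hypothetical.

```python
TAU = "tau"  # assumed spelling of the internal action

def pre_delay(transitions, a, C):
    """Least set of states closed under (D0), (D1) and (D2) for action a and target set C."""
    pre = set(C) if a == TAU else set()                              # (D0)
    pre |= {s for (s, b, t) in transitions if b == a and t in C}     # (D1)
    changed = True
    while changed:                                                   # (D2), iterated to a fixed point
        changed = False
        for (s, b, t) in transitions:
            if b == TAU and t in pre and s not in pre:
                pre.add(s)
                changed = True
    return pre

# Toy LTS (hypothetical): s0 --tau--> s1 --a--> s2, and s3 --b--> s2.
lts = {("s0", TAU, "s1"), ("s1", "a", "s2"), ("s3", "b", "s2")}
```

On this toy system, `s0` can perform `a` after one internal move, so both `s0` and `s1` belong to the delay predecessors of `{"s2"}` for action `a`, while `s3` does not.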

[26] presented an elegant reformulation of strong bisimulation for a variant of
PLTSs which takes the probabilistic effect of the transitions into account.
Formally, strong bisimulation equivalence $\sim_{sbis}$ in a PLTS is the coarsest
equivalence $R$ on the state space $S$ such that for all $(s, s') \in R$ and
transitions $s \xrightarrow{a} \mu$ there is a transition
$s' \xrightarrow{a} \mu'$ where $\mu$ and $\mu'$ return the same probabilities for
all equivalence classes under $R$ (i.e. $\mu \equiv_R \mu'$, cf. Notation 1).
[36,37] presented notions of weak and branching bisimulations for PLTSs. All
these notions of bisimulation equivalences on a PLTS $(S, Act, \to)$ can be
characterized as the coarsest equivalence $R$ on $S$ such that

(PBis) If $(s,s') \in R$, $M \in Distr(S)/\equiv_R$ and $s \xrightarrow{a} M$ then $s' \in Pre^{*}(a, M)$.

Here, $s \xrightarrow{a} M$ iff $s \xrightarrow{a} \mu$ for some $\mu \in M$. E.g.,
strong bisimulation equivalence is given by (PBis) using the predecessor predicate
$Pre^{str}(a, M) = \{s' : s' \xrightarrow{a} M\}$.

We now propose novel notions of bisimulation equivalence for PLTSs which
are conservative extensions of delay bisimulation equivalence [40,14]. Intuitively,
two states $s$, $s'$ are identified iff any transition $s \xrightarrow{a} \mu$ can
be simulated by $s'$ by first performing finitely many internal moves and then
performing an $a$-labelled transition for which the outcome of the associated
probabilistic choice agrees with $\mu$.⁶ Thus, we aim at an appropriate definition
of the predecessor predicate $Pre^{del}(a,M)$ where (for $a \neq \tau$ or
$\mu^1_{s'} \notin M$) $s' \in Pre^{del}(a,M)$ states the possibility for $s'$ to
perform the action $a$ (possibly with a certain delay) such that the associated
distribution $\mu'$ of the $a$-labelled transition belongs to $M = [\mu]_R$.
Conditions (D0) and (D1) for $Pre^{del}(a,C)$ can easily be lifted to the
probabilistic case (see conditions (BD0) and (BD1) below).

⁵ Thus, $Pre^{del}(a, C) = \{s' : s' \xrightarrow{\tau^*}\xrightarrow{a} C\}$ for
$a \neq \tau$ and $Pre^{del}(\tau, C) = \{s' : s' \xrightarrow{\tau^*} C\}$.

⁶ This informal explanation assumes that $s \xrightarrow{a} \mu$ is a “proper”
transition, i.e. either $a \neq \tau$ or $\mu^1_s \notin M$. Transitions of the
form $s \xrightarrow{\tau} \mu$ where all possible target states
$t \in Supp(\mu)$ are equivalent to $s$ can be viewed as “silent moves” and are
not taken into account when dealing with equivalences that abstract from internal
computations.
(BD0) If $\mu^1_s \in M$ then $s \in Pre^{del}(\tau, M)$.

(BD1) If $s \xrightarrow{a} M$ then $s \in Pre^{del}(a, M)$.

To adapt condition (D2) for the probabilistic setting, we have two possibilities
depending on whether or not we allow for unbounded delays. For the simpler
case, we require bounded delays which leads to condition (BD2).

(BD2) If $s \xrightarrow{\tau} \nu$ and $Supp(\nu) \subseteq Pre^{del}(a, M)$ then $s \in Pre^{del}(a, M)$.
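The bounded-delay conditions (BD0) to (BD2) can again be evaluated as a least fixed point. The sketch below is illustrative only: distributions are dicts with `Fraction` values, the set `M` is passed as a membership test `in_M`, and the toy system is a hypothetical fragment loosely modelled on the lossy medium of Example 1 (the state `s_pre` and the sink `s_done` are invented for the illustration).

```python
from fractions import Fraction

TAU = "tau"  # assumed spelling of the internal action

def pre_bounded(states, transitions, a, in_M):
    """Least set of states closed under (BD0), (BD1) and (BD2)."""
    pre = set()
    if a == TAU:                                                     # (BD0)
        pre |= {s for s in states if in_M({s: Fraction(1)})}
    pre |= {s for (s, b, mu) in transitions if b == a and in_M(mu)}  # (BD1)
    changed = True
    while changed:                                                   # (BD2)
        changed = False
        for (s, b, nu) in transitions:
            if b == TAU and s not in pre:
                if all(t in pre for t, p in nu.items() if p > 0):
                    pre.add(s)
                    changed = True
    return pre

states = {"s_pre", "s_del", "s_ok", "s_done"}
transitions = [
    ("s_pre", TAU, {"s_ok": Fraction(1)}),
    ("s_del", TAU, {"s_ok": Fraction(99, 100), "s_del": Fraction(1, 100)}),
    ("s_ok", "cons", {"s_done": Fraction(1)}),
]
in_M = lambda mu: mu.get("s_done", 0) == 1   # M: distributions concentrated on s_done
result = pre_bounded(states, transitions, "cons", in_M)
```

In this toy system `s_pre` is included because all of its internal successors can perform `cons`, while `s_del` is excluded: its internal step may loop back to `s_del` itself, which is exactly the kind of unbounded delay that (BD2) does not capture.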

The resulting bisimulation equivalence only abstracts from the combination of
finitely many internal moves (corresponding to a bounded delay) but cannot
involve the effect of infinite $\tau$-paths (unbounded delays). In the
communication protocol of Example 1 one might argue that the states $s_{del}$ and
$s_{ok}$ have the same observable behaviour as $s_{del}$ moves via
$\tau$-transitions to $s_{ok}$ with probability 1. To formalize the effect of
infinite $\tau$-loops, we use the concept of norm functions which was introduced
in [?] to reason about simulation-like relations in non-probabilistic systems. We
slightly depart from the notations of [?] and define norm functions in LTSs as
partial functions with three arguments (a state $s$, an action label $a$ and a set
$C$ of target states) and whose range are the natural numbers. If the value
$n(s,a,C)$ is defined then $s \in Pre^{del}(a,C)$ in which case there is a
$\tau^*$-labelled path of length $\leq n(s, a, C)$ from $s$ to a state $t$ where
either $t \xrightarrow{a} C$ or $a = \tau$ and $t \in C$. If
$s \notin Pre^{del}(a,C)$ then $n(s,a,C)$ is undefined (denoted
$n(s, a, C) = \bot$). The formal definition of norm functions in LTSs arises by
“refining” the above mentioned three conditions for $Pre^{del}(a, C)$ in the sense
that we involve the length of a delayed transition. Formally, norm functions in
LTSs are partial functions satisfying the following three conditions.

(N0) $n(s, a, C) = 0$ implies $a = \tau$ and $s \in C$
(N1) $n(s, a, C) = 1$ implies $s \xrightarrow{a} C$
(N2) If $n(s, a, C) \geq 2$ then there is a transition $s \xrightarrow{\tau} t$ where $n(t, a, C) < n(s, a, C)$.
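One way to obtain a partial function satisfying (N0) to (N2) in a finite LTS is to take, for every state, the smallest admissible value. The sketch below does this by simple relaxation; it is an illustration rather than the construction used in the paper, and it assumes `(source, action, target)` triples, the spelling `"tau"` for the internal action, and the hypothetical name `minimal_norm`. States missing from the returned dict are those where the norm is undefined.

```python
TAU = "tau"  # assumed spelling of the internal action

def minimal_norm(transitions, a, C):
    """Smallest values n(s, a, C) compatible with (N0)-(N2); missing keys mean 'undefined'."""
    norm = {}
    if a == TAU:
        for s in C:
            norm[s] = 0                                   # (N0)
    for (s, b, t) in transitions:
        if b == a and t in C and s not in norm:
            norm[s] = 1                                   # (N1)
    changed = True
    while changed:                                        # (N2): one more internal step
        changed = False
        for (s, b, t) in transitions:
            if b == TAU and t in norm:
                candidate = norm[t] + 1
                if s not in norm or candidate < norm[s]:
                    norm[s] = candidate
                    changed = True
    return norm

# Toy LTS (hypothetical): s0 --tau--> s1 --tau--> s2 --a--> s3, and s4 --b--> s3.
lts = {("s0", TAU, "s1"), ("s1", TAU, "s2"), ("s2", "a", "s3"), ("s4", "b", "s3")}
norms = minimal_norm(lts, "a", {"s3"})
```

The resulting values count the internal moves before the `a`-step plus one, and `s4` receives no value because it cannot reach an `a`-transition into `{"s3"}` by internal moves.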

To adapt these three conditions to the probabilistic setting, we deal with a set
$M \subseteq Distr(S)$ as the third argument of a norm function. The modifications
of (N0) and (N1) are straightforward. In (N2) we require that $n(s, a, M) \geq 2$
implies the existence of a transition $s \xrightarrow{\tau} \nu$ satisfying a
certain condition. When we aim at bounded delays then we deal with the constraint
$n(t,a,M) < n(s,a,M)$ for all $t \in Supp(\nu)$. For unbounded delays, we require
that $n(t, a, M)$ is defined for all $t \in Supp(\nu)$ and $n(t,a,M) < n(s,a,M)$
for some $t \in Supp(\nu)$.⁷

Definition 2. A norm function for a PLTS $(S, Act, \to)$ is a partial function
$n : S \times Act \times 2^{Distr(S)} \to \mathbb{N}$ which satisfies the
following conditions.

(PN0) $n(s, a, M) = 0$ implies $a = \tau$ and $\mu^1_s \in M$.

(PN1) $n(s, a, M) = 1$ implies $s \xrightarrow{a} M$ (i.e. $s \xrightarrow{a} \mu$ for some $\mu \in M$).

(PN2) If $n(s, a, M) \geq 2$ then there is a transition $s \xrightarrow{\tau} \nu$ where
   (i) $n(t, a, M) \neq \bot$ for all $t \in Supp(\nu)$
   (ii) $n(t, a, M) < n(s, a, M)$ for some $t \in Supp(\nu)$

$n$ is strict iff, in (ii), $n(t, a, M) < n(s, a, M)$ for all $t \in Supp(\nu)$.

⁷ These two conditions about $Supp(\nu)$ guarantee the existence of a scheduler
where, for any state $s$ for which $n(s, a, M)$ is defined, almost all paths
starting in $s$ lead via $\tau$'s to a state $t$ where $n(t, a, M) \in \{0, 1\}$.
Thus, in this scheduler, with probability 1, $s$ performs finitely many $\tau$'s
followed by a transition $\xrightarrow{a} \mu'$ where $\mu' \in M$. However, for
this scheduler, there might be no upper bound for the number of $\tau$'s that will
be performed before the action $a$.

Example 2. Consider the system on the right. Let $M$ be the set of distributions
$\mu$ that return probability 1 for the $x$-states (i.e.
$M = \{\mu : \mu(x_1) + \mu(x_2) = 1\}$). Then, $\mu^1_{x_1}, \mu^1_{x_2} \in M$
and $s_1 \xrightarrow{a} M$, $s_2 \xrightarrow{a} M$. Thus, the partial function
$n$ with $n(s_0, a, M) = 2$, $n(s_1, a, M) = n(s_2, a, M) = 1$ and
$n(\cdot) = \bot$ in all other cases is a strict norm function.
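The conditions of Definition 2 can be checked mechanically for a fixed action and a fixed set $M$. The following sketch is illustrative: the figure of Example 2 is not reproduced here, so the transition structure below (a fair internal choice from `s0` between `s1` and `s2`) is an assumption that is merely consistent with the values stated in the example, and the second system is a hypothetical lossy retry loop. Distributions are dicts, `M` is a membership test, and a missing key in `norm` stands for "undefined".

```python
from fractions import Fraction

TAU = "tau"  # assumed spelling of the internal action

def is_norm_function(norm, transitions, a, in_M, strict=False):
    """Check (PN0)-(PN2) for one action a and one set M (given as a predicate)."""
    for s, k in norm.items():
        if k == 0:                                                    # (PN0)
            if not (a == TAU and in_M({s: Fraction(1)})):
                return False
        elif k == 1:                                                  # (PN1)
            if not any(src == s and b == a and in_M(mu)
                       for (src, b, mu) in transitions):
                return False
        else:                                                         # (PN2)
            witness = False
            for (src, b, nu) in transitions:
                if src != s or b != TAU:
                    continue
                supp = [t for t, p in nu.items() if p > 0]
                if not all(t in norm for t in supp):                  # (i)
                    continue
                smaller = [norm[t] < k for t in supp]                 # (ii)
                if all(smaller) if strict else any(smaller):
                    witness = True
                    break
            if not witness:
                return False
    return True

half = Fraction(1, 2)
example2 = [("s0", TAU, {"s1": half, "s2": half}),
            ("s1", "a", {"x1": Fraction(1)}),
            ("s2", "a", {"x2": Fraction(1)})]
in_M = lambda mu: mu.get("x1", 0) + mu.get("x2", 0) == 1
n = {"s0": 2, "s1": 1, "s2": 1}

retry = [("s_del", TAU, {"s_ok": Fraction(99, 100), "s_del": Fraction(1, 100)}),
         ("s_ok", "a", {"x1": Fraction(1)})]
n_retry = {"s_del": 2, "s_ok": 1}
```

Under these assumptions `n` passes the strict check, whereas `n_retry` is a norm function but not a strict one, because the internal step of `s_del` can return to `s_del` itself.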

Definition 3. Let $(S, Act, \to)$ be a PLTS and $R$ an equivalence on $S$. $R$ is
called a (strict) normed bisimulation iff there exists a (strict) norm function
$n$ such that for all $a \in Act$ and $M \in Distr(S)/\equiv_R$: if
$(s,s') \in R$ and $s \xrightarrow{a} M$ then $n(s', a, M) \neq \bot$. Two states
$s$ and $s'$ are called (strictly) normed bisimilar (denoted $s \approx_n s'$
resp. $s \approx_{sn} s'$) iff there exists a (strict) normed bisimulation $R$
such that $(s, s') \in R$. The equivalences $\approx_n$ and $\approx_{sn}$ are
adapted for probabilistic programs in the obvious way.⁸

Example 3. It is easy to see that the states $s_0$, $s_1$ and $s_2$ in Example 2
are strictly normed bisimilar. For the simple communication protocol of Example 1
and the smallest equivalence relation $R$ that identifies $s_{del}$ and $s_{ok}$,
there is a norm function with $n(s_{ok}, \tau, [\nu]_R) = 0$ and
$n(s_{del}, cons, [\mu^1_{s_{init}}]_R) = 2$ but no strict norm function. Thus,
$s_{ok} \approx_n s_{del}$ but $s_{ok} \not\approx_{sn} s_{del}$. The quotient
system that we get when we identify the states by their normed bisimulation
equivalence classes can be viewed as a failure-free specification (see the
picture on the right).

$\approx_n$ and $\approx_{sn}$ can be characterized by condition (PBis) with
suitably defined predecessor predicates. The unbounded delay predecessor predicate
$Pre^{del}_{ud}(a, M)$ is the set of states $s$ where $n(s,a,M) \neq \bot$ for
some norm function $n$. The bounded predecessor predicate $Pre^{del}_{bd}(a,M)$
is the set of states $s$ such that $n(s,a,M) \neq \bot$ for some strict norm
function $n$.⁹ Then, $Pre^{del}_{bd}(a,M)$ is the least set satisfying the three
conditions (BD0), (BD1), (BD2). In what follows, we simply write
$Pre^{del}(a, M)$ to denote $Pre^{del}_{bd}(a,M)$ or $Pre^{del}_{ud}(a, M)$
depending on whether we deal with strict normed bisimulation or normed
bisimulation equivalence. It is easy to see that (strict) normed bisimulation
equivalence meets the general characterization of bisimulation equivalences in
PLTSs via condition (PBis). More precisely, (strict) normed bisimulation
equivalence is the coarsest equivalence $R$ on $S$ such that $(s, s') \in R$,
$M \in Distr(S)/\equiv_R$ and $s \xrightarrow{a} M$ implies
$s' \in Pre^{del}(a, M)$. (Strict) normed bisimulation equivalence lies strictly
between strong ($\sim_{sbis}$) and weak ($\approx_{wbis}$) bisimulation
equivalence à la [26,36,37], i.e.
$\sim_{sbis} \subset \approx_{sn} \subset \approx_n \subset \approx_{wbis}$. The
communication protocol and its failure-free specification are examples that
demonstrate the difference between $\approx_{sn}$ and $\approx_n$.

⁸ Recall that a probabilistic program is a PLTS with an initial state (Def. 1).
Let $\mathcal{P}_i$ be probabilistic programs with initial states $s_i$,
$i = 1, 2$. We define $\mathcal{P}_1 \approx_n \mathcal{P}_2$ iff
$s_1 \approx_n s_2$ (and likewise for $\approx_{sn}$) where $s_1$, $s_2$ are
viewed as states in the composed system $\mathcal{P}_1 \uplus \mathcal{P}_2$ which
arises from the disjoint union of the state spaces of $\mathcal{P}_1$ and
$\mathcal{P}_2$.

⁹ For an LTS, viewed as a PLTS, the unbounded and bounded predecessor predicates
coincide. More precisely,
$Pre^{del}(a,C) = Pre^{del}_{ud}(a, M_C) = Pre^{del}_{bd}(a, M_C)$ for any set $C$
of states and $M_C = \{\mu^1_t : t \in C\}$.

A crucial property of a simulation equivalence is soundness w.r.t. parallel
composition, since this allows for compositional analysis. Another very
important property is soundness w.r.t. a specification logic. For divergence-free
processes (processes without $\tau$-loops), our equivalences are sound for
quantitative linear time properties. These express that, independent of how the
non-determinism is resolved, the probability of a certain set of traces is larger
than some number $p$.

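Proposition 1. If $\mathcal{P}_1$ and $\mathcal{P}_2$ are divergence-free and
$\mathcal{P}_1 \approx_n \mathcal{P}_2$, then $\mathcal{P}_1$ and $\mathcal{P}_2$
satisfy exactly the same quantitative linear time properties.

Proposition 2. $\mathcal{P}_1 \approx_n \mathcal{P}_2$ implies
$\mathcal{P}_1 \| Q \approx_n \mathcal{P}_2 \| Q$ and similarly for $\approx_{sn}$.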

4 Decidability 

In this section, we present an algorithm that computes the (strict) normed
bisimulation equivalence classes in polynomial time and space. The main idea of
our algorithm is a modification of the prominent splitter/partitioning technique
[25,30] (which is sketched in Figure 1) that was proposed for computing the
strong bisimulation equivalence classes in a non-probabilistic transition system.
The basic idea is to start with the trivial partition $\mathcal{X} = \{S\}$ of the
state space $S$ and then successively refine $\mathcal{X}$ by splitting the blocks
$B$ of $\mathcal{X}$ into subblocks according to a refinement operator
$Ref(\mathcal{X}, a, C)$ that depends on a splitter, i.e. an action/block pair
$(a,C)$. More precisely, $Ref(\mathcal{X},a,C)$ divides each block
$B \in \mathcal{X}$ into the subblock $B \cap Pre^{str}(a, C)$ and its complement
$B \setminus Pre^{str}(a, C)$.¹⁰ Using an appropriate organization of the
splitters (resp. splitter candidates), this method can be implemented in time
$O(m \log n)$ where $n$ is the number of states and $m$ the number of transitions
(i.e. the size of $\to$) [?]. The above sketched technique can easily be modified
to compute several other types of bisimulation equivalence classes, such as the
strong [20] or weak [1] bisimulation equivalence classes in fully probabilistic
systems, but fails for strong (and hence for normed) bisimulation in PLTSs when
action/block pairs are used as splitters [3].

¹⁰ $Ref(\mathcal{X},a,C)$ yields the partition
$\bigcup_{B \in \mathcal{X}} Ref(B, a, C)$ where
$Ref(B,a,C) = \{B \cap Pre^{str}(a, C), B \setminus Pre^{str}(a, C)\} \setminus \{\emptyset\}$.
$\mathcal{X} := \{S\}$;
While $\mathcal{X}$ can be refined do
    choose some splitter $(a, C)$ of $\mathcal{X}$ and put $\mathcal{X} := Ref(\mathcal{X}, a, C)$;
Return $\mathcal{X}$.

Fig. 1. Schema for computing the bisimulation equivalence classes in LTSs
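The schema of Fig. 1 can be turned into a short, deliberately naive program. The sketch below recomputes the strong predecessor set for every action/block pair and restarts after each successful split, so it does not attain the $O(m \log n)$ bound mentioned above; the representation of an LTS as `(source, action, target)` triples and the name `refine_lts` are assumptions made for the illustration.

```python
def refine_lts(states, transitions):
    """Naive splitter refinement: returns the strong bisimulation classes of a finite LTS."""
    partition = {frozenset(states)}
    actions = {a for (_, a, _) in transitions}
    changed = True
    while changed:
        changed = False
        for a in actions:
            for C in list(partition):
                # Strong predecessors of block C under action a.
                pre = {s for (s, b, t) in transitions if b == a and t in C}
                refined = {blk for B in partition
                           for blk in (B & pre, B - pre) if blk}
                if refined != partition:      # (a, C) is a splitter
                    partition = refined
                    changed = True
                    break
            if changed:
                break
    return partition

# Toy LTS (hypothetical): p and q both do a to r, and r does b to u.
lts = {("p", "a", "r"), ("q", "a", "r"), ("r", "b", "u")}
classes = refine_lts({"p", "q", "r", "u"}, lts)
```

On the toy system the loop first separates the states that can perform `a` from the rest and then splits `r` from `u`, leaving `p` and `q` in one class.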



In the remainder of this section, we explain how the splitter/partitioning 
technique can be modified to get a polynomial-time algorithm for computing 
the (strict) normed bisimulation equivalence classes in a PLTS. 

Notation 2 We fix a PLTS $(S, Act, \to)$. Let
$M_a = \bigcup_{s \in S} Steps_a(s)$. For $Z$ a finite set, we write $|Z|$ to
denote the number of elements in $Z$. Let $n = |S|$ be the number of states,
$m = |\to|$ the total number of transitions and
$m_\tau = \sum_{s \in S} |Steps_\tau(s)|$ the number of $\tau$-transitions. We
assume that $Act$ does not contain redundant actions, i.e. we require that
$M_a \neq \emptyset$ for all actions $a$.

We use similar ideas as suggested in [3] where an algorithm for computing the
strong bisimulation equivalence classes of a PLTS in time
$O(mn(\log m + \log n))$ is presented. The key idea is to refine the current state
partition according to splitters of the form $(a, M)$ where $a$ is an action and
$M$ a subset of $M_a$. That is, we successively replace the current state
partition $\mathcal{X}$ by

$Ref(\mathcal{X}, a, M) = \bigcup_{B \in \mathcal{X}} Ref(B,a,M)$

where $Ref(B, a, M) = \{B \cap Pre^{del}(a, M), B \setminus Pre^{del}(a, M)\} \setminus \{\emptyset\}$.

Notation 3 A step partition is a set $\mathcal{M}$ consisting of pairs $(a, M)$
where $M \subseteq M_a$ and such that, for any action $a$,
$\{M : (a, M) \in \mathcal{M}\}$ is a partition of $M_a$. We refer to the pairs
$(a, M)$ as step classes. Given a state partition $\mathcal{X}$, the induced
step partition $\mathcal{M}_{\mathcal{X}}$ consists of the step classes $(a, M)$
where $M \in M_a/\equiv_{\mathcal{X}}$. Here, $\mu \equiv_{\mathcal{X}} \mu'$ iff
$\mu[C] = \mu'[C]$ for all $C \in \mathcal{X}$.



$\chi := \{S\}$;

While $\chi$ can be refined do

    choose some step class $(a, M)$ of $\mathcal{M}_\chi$ and put $\chi := Ref(\chi, a, M)$;

Return $\chi$.

Fig. 2. Schema for computing the bisimulation equivalence classes in PLTSs 

The rough ideas behind our algorithm are sketched in Figure 2. To keep book
about the splitter candidates $(a, M)$ we use a step partition $\mathcal{M}$ (that agrees
with $\mathcal{M}_\chi$ after any iteration) and a set $SplCnd$ (e.g. organized as a queue)
which contains the step classes that will serve as splitter candidates. Initially,
$SplCnd$ consists of the “trivial” step classes $(a, M_a)$. In each iteration, we first
refine the state partition $\chi$ according to a step class $(a, M) \in SplCnd$ which
yields the new state partition $\chi_{new} = Ref(\chi, a, M)$. Then, we adjust $\mathcal{M}$ to
$\chi_{new}$, i.e. calculate $\mathcal{M}_{new} = \mathcal{M}_{\chi_{new}}$. The step classes $(b, N') \in \mathcal{M}_{new} \setminus \mathcal{M}$
are viewed as splitter candidates and are inserted into $SplCnd$. To derive $\mathcal{M}_{new}$
from $\mathcal{M}$ we have to replace any step class $(b, N)$ in $\mathcal{M}$ by the step classes $(b, N')$
where $N' \in N / \equiv_{\chi_{new}}$. At the beginning of any iteration we have $\mathcal{M} = \mathcal{M}_\chi$.
Thus, for $(b, N) \in \mathcal{M}$ and $\nu, \nu' \in N$ we have $\nu[B] = \nu'[B]$ for all $B \in \chi$.

Let $B \in \chi$ and $B' = B \cap Pre(a, M)$, $B'' = B \setminus B'$ and $C \in \{B', B''\}$,
$(b, N) \in \mathcal{M}$ and $\nu, \nu' \in N$. Then, $\nu[C] = \nu'[C]$ iff $\nu[B'] = \nu'[B']$ and
$\nu[B''] = \nu'[B'']$. These observations motivate the use of a set $NewBl$ which contains only
those blocks $C \in \chi_{new}$ that are relevant for the computation of $\mathcal{M}_{new}$. More
precisely, for any block $B \in \chi$ where $|Ref(B, a, M)| = 2$, we choose a block
$C_B \in Ref(B, a, M)$ such that $|C_B| \le |B|/2$ and define
$NewBl = \{C_B : B \in \chi,\ |Ref(B, a, M)| = 2\}$. Then, $\mathcal{M}_{new}$ can be derived from $\mathcal{M}$ by replacing any
$(b, N)$ in $\mathcal{M}$ by the step classes in $Split((b, N), NewBl)$ which we compute as
follows. We start with $\mathcal{N} = \{(b, N)\}$. Then, for all $C' \in NewBl$ we replace any
$(b, N')$ in $\mathcal{N}$ by $Split((b, N'), C')$ where the operator $Split((b, N'), C')$ divides
$(b, N')$ into the step classes $(b, N'_1), \ldots, (b, N'_k)$ where $N'_1, \ldots, N'_k$ is the splitting
of $N'$ according to the probabilities for $C'$. These ideas lead to the algorithm
sketched in Figure 3.
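
The splitting operator just described can be sketched as follows. This is only an illustration under stated assumptions: distributions are encoded as dictionaries from states to exact fractions, a step class is a pair of an action and a tuple of distributions, and a hash map replaces the balanced search tree used in the complexity analysis; the names `prob`, `split_on_block` and `split` are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction
from typing import Dict, Iterable, List, Set, Tuple

Distribution = Dict[str, Fraction]
StepClass = Tuple[str, Tuple[Distribution, ...]]


def prob(nu: Distribution, block: Set[str]) -> Fraction:
    """Probability nu[C] that the distribution assigns to the block C."""
    return sum((p for s, p in nu.items() if s in block), Fraction(0))


def split_on_block(step_class: StepClass, block: Set[str]) -> List[StepClass]:
    """Divide (b, N') into classes whose members agree on the probability of C'."""
    action, distributions = step_class
    groups = defaultdict(list)
    for nu in distributions:
        groups[prob(nu, block)].append(nu)
    # Sorting by probability only makes the output order deterministic.
    return [(action, tuple(group)) for _, group in sorted(groups.items())]


def split(step_class: StepClass, new_blocks: Iterable[Set[str]]) -> List[StepClass]:
    """Refine a step class successively with respect to every block in NewBl."""
    classes = [step_class]
    for block in new_blocks:
        classes = [piece for cls in classes for piece in split_on_block(cls, block)]
    return classes
```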

Theorem 1. The (strict) normed bisimulation equivalence classes can be computed
in time $O(mn(\log m + \log n) + m_\tau n^2)$ and space $O(mn)$.

The remainder of this section is concerned with the proof of Theorem 1. It follows
from Prop. 3 and Prop. 4. We put $\chi_0 = \{S\}$ and write $\chi_i$ to denote the state
partition $\chi$ after the $i$-th iteration. Similarly, we use the notations $\mathcal{M}_i$, $SplCnd_i$
and $NewBl_i$ with the obvious meaning. Let $AllSplCnd = \bigcup_{i \ge 0} SplCnd_i$ be the set
of all step classes $(a, M)$ that once serve as splitters for the state partition $\chi$
and let $AllNewBl = \bigcup_{i} NewBl_i$ be the set of all blocks $C'$ that once are used in a
splitting operation $Split(\cdot, C')$. Using set-theoretic arguments, we get:

(i) $|AllSplCnd| \le |\mathcal{M}_0 \cup \mathcal{M}_1 \cup \ldots| \le 2(m - 1)$

(ii) $|AllNewBl| \le |\chi_0 \cup \chi_1 \cup \ldots| \le 2(n - 1)$

(iii) $\sum_{C' \in AllNewBl} |C'| \le n \log n$.

Proposition 3. The operations $Ref(\chi, a, M)$ in step (2) of the algorithm in
Fig. 3 can be implemented in time $O(m_\tau n^2)$ (where we range over all iterations).

Proof. Clearly, given the predecessor predicate $Pre(a, M)$ for a fixed step
class $(a, M)$, $Ref(\chi, a, M)$ can be performed in time $O(n)$ when appropriate
data structures are used. Combining (i) and the following Lemmas 1 and 2 we
get the desired bound for the time complexity.

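$Split((b, N), NewBl)$ returns the set of step classes $(b, N')$ where $N' \in N / \equiv$ and
$\nu \equiv \nu'$ iff $\nu[C'] = \nu'[C']$ for all $C' \in NewBl$. As $N \in M_b / \equiv_\chi$ yields that $\equiv$ and
$\equiv_{\chi_{new}}$ coincide on $N$, we get $\mathcal{M}_{\chi_{new}} = \bigcup_{(b, N) \in \mathcal{M}} Split((b, N), NewBl)$.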
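$\chi := \{S\}$; $\mathcal{M} := \{(a, M_a) : a \in Act\}$; $SplCnd := \mathcal{M}$;

While $SplCnd \neq \emptyset$ do

  (1) choose some step class $(a, M)$ of $SplCnd$ and remove $(a, M)$ from $SplCnd$;

  (2) (* Computation of $\chi_{new} := Ref(\chi, a, M)$ *)

      $P := Pre(a, M)$; $\chi_{new} := \emptyset$; $NewBl := \emptyset$;
      For all $B \in \chi$ do

          $B' := B \cap P$; $B'' := B \setminus B'$; $\chi_{new} := \chi_{new} \cup \{B', B''\} \setminus \{\emptyset\}$;

          If $\emptyset \neq B' \neq B$ then

              If $|B'| < |B''|$ then $C' := B'$ else $C' := B''$;

              $NewBl := NewBl \cup \{C'\}$;

  (3) (* Computation of $\mathcal{M}_{new} := \mathcal{M}_{\chi_{new}}$ *)

      $\mathcal{M}_{new} := \emptyset$;

      For all $(b, N) \in \mathcal{M}$ do

          $\mathcal{N} := Split((b, N), NewBl)$; $\mathcal{M}_{new} := \mathcal{M}_{new} \cup \mathcal{N}$;

          If $|\mathcal{N}| \ge 2$ then $SplCnd := SplCnd \cup \mathcal{N}$;

  (4) $\chi := \chi_{new}$; $\mathcal{M} := \mathcal{M}_{new}$;

Return $\chi$.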

Fig. 3. Algorithm for the (strict) normed bisimulation equivalence classes 



Lemma 1. Pre'l^^{a,M) can be computed in time 0{nirn). 

Proof. Pre'l^\a,M) is the least subset of S satisfying the conditions (BDO), 
(BDl) and (BD2). The standard iterative method for computing the least fixed 
point of a monotonic set- valued operator leads to the following method for com- 
puting Pre'l‘^\a,M). We consider the directed graph = (V,E) with 

the vertex set F = S' U Mr and the edge set E = {(:/, s) C Mr x S : 

U {(s, € S X Mr : s C Supp{v)}. We assume a representation of G^^\a,M) 

by adjacency lists and write E{-) to denote the adjacency list of (•). We use the 
algorithm shown in Fig. 2] to compute the set Pref^^{a,M). For any v £ N, 
we use a counter c(i/) for the number of states t £ Supp(y) where the con- 
dition t £ Pre'l^^{a,M) is not yet verified. Nq collects all v where c{v) = 0, 
i.e. Supp{v) C Pre‘jf^{a,M). Hence, if vq £ Nq and s-^vq then we may insert s 
into Pre'l)‘^\a, M). Clearly, this method can be implemented in time 0{nirn). 
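
The counter technique used in this proof can be sketched in isolation. The sketch below is a simplification and not the algorithm of Fig. 4: the conditions (BD0) to (BD2) are not reproduced here, so the set `base` of states known to satisfy the predicate is taken as an input, and the only closure rule implemented is "if $s$ has a $\tau$-step $\nu$ whose support is already contained in the set, add $s$". All names are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple

TauStep = Tuple[str, Set[str]]     # (source state s, support Supp(nu)) for s --tau--> nu


def backward_closure(base: Iterable[str], tau_steps: List[TauStep]) -> Set[str]:
    """Least superset P of `base` such that Supp(nu) <= P implies s in P."""
    closed: Set[str] = set(base)
    counter: List[int] = []            # counter[i]: states of Supp(nu_i) not yet verified
    waiting: Dict[str, List[int]] = {} # state t -> indices of steps with t in the support
    ready: deque = deque()             # steps whose counter has reached 0 (the set N_0)

    for i, (_, support) in enumerate(tau_steps):
        pending = set(support) - closed
        counter.append(len(pending))
        for t in pending:
            waiting.setdefault(t, []).append(i)
        if not pending:
            ready.append(i)

    while ready:
        source = tau_steps[ready.popleft()][0]
        if source in closed:
            continue
        closed.add(source)
        # Every step waiting for `source` has one unverified state fewer.
        for i in waiting.get(source, ()):
            counter[i] -= 1
            if counter[i] == 0:
                ready.append(i)
    return closed
```

Each step is put on the work list at most once and each edge of the graph is inspected a constant number of times, which is the reason why a bound linear in the size of the graph is obtained.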

Lemma 2. Pre'^f{a, M) can be computed in time 0{mrn). 

Proof. To compute Pre^^{a,M) we suggest a graph-theoretical method which 
is based on the following observation. Pre^l\a, M) is the least subset of S which 
contains Pre~^{a, M) = Pre^{a, M) U Pre^{a, M) (where Pre^{r, M) = {s : 
£ M}, Pre°{a,M) = 0 if a is visible and Pre^{a,M) = {s : s-^M}) and 
Fig. 4. Algorithm for computing Pre'l^\a,M)



satisfies the following condition. Whenever CCS such that for any s G C there 
is a finite path s = sq si s; t where si, . . . ,si € C, Si^^Vi 

with Supp{vi) C CU Pre‘l^f {a, M), t = 0, 1, . . . , and t € Pre'lf{a, M) then C C 
Pre'lf{a, M). On the basis of this characterization, we compute Pre^f (a, M) as 
follows. We start with Pre‘l^l\a, M) = Pre~^{a, M) Then, we successively add 
all states of a set C satisfying the above condition which can be reformulated 



by means of the strongly connected components (SCCs) in a certain directed 
graph. We consider the directed graph M) = {V, E) where the vertex set 

V is given by F = {(s, Jz) : v G Steps^{s),s ^ Pre~^{a, M)} U Pre~^{a, M) 



and the edge set is if = {((s, (s', :z')) : s G Supp(iy')} U {{u,{s' ,C)) : 

u G Pre-^{a,M) n Supp{C)}. First we compute the SCCs of G = G([^*(a, M) 



and a topological sorting Ci,...,Cr on them. The singleton sets {m} where 
u G Pre-^{a,M) are bottom SCCs (BSCCs) in G. Thus, we may assume that 
there is some h such that Pre~^{a, M) = GiU. . .UG/i and Ci C V\Pre~^{a, M), 
i = h+1, . . . ,r. We start with Pre([j^(a, M) = Pre~^{a, M). For i = /i+ 1, . . . , r. 



if Ci is not a BSCC (i.e. {j : Cj Ci} yf 0) and all states of a predecessor SCC 
Cj of Ci belong to Pre'^f{a, M) then we insert the states of Ci into Pre'^f {a,M). 
Clearly, G = G'^{a,M) has 0{mr + n) vertices and 0{mrn) edges. Hence, this 



method can be implemented in time and space 0{mrn). 
Proposition 4. Ranging over all iterations, the computation of the step partitions
$\mathcal{M}_{new}$ in step (3) of Figure 3 takes $O(mn(\log m + \log n))$ time.

Proof. For calculating $Split((b, N'), C')$ we may apply a technique (similar to the
one suggested in [?]) which generates an ordered balanced tree (e.g. AVL tree) by
successively inserting the values $\nu[C']$, $\nu \in N'$, possibly creating new nodes and
performing the necessary rebalancing steps. Any node $v$ in this tree is labelled by
a key value $v.key$ (which is one of the probabilities $\nu[C']$ for one or more $\nu \in N'$)
and a subset $v.distr$ of $N'$. Then, $Split((b, N'), C')$ is the set of pairs $(b, v.distr)$
where $v$ is a node in the final tree. The construction of the tree causes the cost
$O(|N'| \log |N'|)$ as for any $\nu \in N'$ we traverse a tree of height $\le \log |N'|$. For
fixed $\nu$ and $C'$, the computation of the value $\nu[C']$ can be implemented in time
$O(|C'|)$. Thus, for any call of the procedure $Split((b, N'), C')$ the time spent for
computing the values $\nu[C']$ is $O(|N'| \cdot |C'|)$ where we range over all $\nu \in N'$.
Summing up over all step classes $(b, N)$ in the current step partition $\mathcal{M}$, any
$C' \in AllNewBl$ causes the cost $O(m \log m + m|C'|)$. (ii) and (iii) yield the time
complexity $O(mn(\log m + \log n))$ for all $Split(\cdot)$ operations together.

5 Conclusion 

We introduced two notions of bisimulation equivalence in probabilistic systems
(with non-determinism) that abstract from internal computations. We presented
polynomial-time algorithms that compute the quotient spaces and briefly discussed
other important issues (soundness for establishing linear time properties
and compositionality). Thus, our notion of bisimulation equivalence yields an
alternative to the weak and branching bisimulations of [SSEZ]. Although the
equivalences à la [?] are the natural probabilistic counterpart to weak/branching
bisimulation equivalence in LTSs [28,31,?], their definitions are rather complicated
and the decidability is still an open problem. We argue that the definitions
of our equivalences, which rely on the rather intuitive concept of norm functions
à la [?], are comparatively simple. Moreover, the use of norm functions in
the definition of our equivalences allows for a characterization of the equivalence
classes by means of graph-theoretical criteria which served as basis for our
algorithm that computes the equivalence classes. In particular, the characterization
of the delay predecessor predicates that we used in the proofs of Lemmas 1
and 2 can easily be rewritten as terms of the relational mu-calculus. It would be
interesting to see whether our ideas can be combined with the techniques of [?] for
computing the bisimulation equivalence in LTSs with a BDD-based model checking
algorithm for the relational mu-calculus, so as to obtain a symbolic technique that
might combat the state explosion problem for PLTSs.

In this paper (where we mainly treated the issue of decidability) we restrict
our attention to finite systems. However, norm functions and the derived notions
of bisimulations can also be defined for infinite systems (see footnote 12). We
believe that, as in the non-probabilistic case, in many applications, it is quite
simple to “guess” a norm function and then to check (e.g. by hand) whether it
fulfills the necessary conditions. Further on, the concept of norm functions can
also serve as basis for simulation preorders that abstract from internal moves and
are computable in finite systems. Further details about normed simulations can be
found in [?] and the forthcoming work [39].

12 For our purposes, it was sufficient to consider the natural numbers as range of the
norm functions. The framework of [?] also covers infinite, possibly uncountable,
Abstract. The Calculus of Inductive Constructions (CIC) is a powerful
type system, featuring dependent types and inductive definitions, that
forms the basis of proof-assistant systems such as Coq and Lego. We
extend CIC with constructor subtyping, a basic form of subtyping in
which an inductive type $\sigma$ is viewed as a subtype of another inductive
type $\tau$ if $\tau$ has more elements than $\sigma$. It is shown that the calculus is
well-behaved and provides a suitable basis for formalizing natural semantics
in proof-development systems.



1 Introduction 

Proof-development systems like Coq [1], HOL [2T], Isabelle [2H] and PVS [22] rely
on powerful type systems featuring (co-)inductive types. The latter, which capture
in a type-theoretical framework the notions of initial algebra or final coalgebra,
are extensively used in the formalization of programming languages, reactive
and embedded systems, communication and cryptographic protocols... While
such works witness that formal verification has reached a certain maturity, users’
efforts are often hindered by the rigidity of the existing tools. Thus providing
type-theoretical tools for increasing the usability of proof-development systems
remains an important objective.

Subtyping is a relation on types that expresses that one type is at least as
general as another one and is embedded in the type system via the subsumption
rule, stating that a term of type $a$ is also of type $b$ whenever $a$ is a subtype
of $b$. While subtyping has long been perceived as a tool which could significantly
improve the usability of proof-development systems, many of the existing
approaches to subtyping are inappropriate for the (co-)inductive approach to
formalization (see Section 5).

Constructor subtyping [20] is a basic form of subtyping in which an inductive
type $\sigma$ is viewed as a subtype of another inductive type $\tau$ if $\tau$ has more
inhabitants than $\sigma$. It is fully compatible with the (co-)inductive approach to
formalization and may be used to specify most of the examples arising in natural
semantics [28]. For example, constructor subtyping may be used to formalize
the expressions of the call-by-value $\lambda$-calculus and of the $\varsigma$-calculus, or the set of
Harrop formulae [7,8]. It may also be used to formalize the datatype of
lists/non-empty lists:



Parameters:        X
Sorts:             list, nelist
Subsort relation:  nelist $\le$ list
Declarations:      nil  : X $\to$ list
                   cons : X $\times$ list $\to$ nelist

The salient feature of constructor subtyping is to impose suitable coherence
conditions on constructor overloading: roughly speaking, constructor declarations
are supposed to be monotonic, i.e. if $c : A \to \sigma$ and $c : B \to \tau$ are constructor
declarations with $\sigma \sqsubseteq \tau$, then one must have $A \sqsubseteq B$. This is trivially
satisfied in the case of the lists/non-empty lists, as constructors are declared only
once, and in most datatypes with overloaded constructors such as the one of
odd/even/natural numbers:

Parameters:
Sorts:             even, odd, nat
Subsort relation:  even, odd $\le$ nat
Declarations:      0 : even
                   S : even $\to$ odd
                   S : odd $\to$ even
                   S : nat $\to$ nat

The immediate benefit of coherence is that the above definitions may be viewed
as deterministic rule sets in the sense of [I] (see Section 5 for the difficulties
with datatypes that do not yield deterministic rule sets). Therefore they support
recursive definitions and may be integrated safely into typed $\lambda$-calculi [?].
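
The monotonicity condition on overloaded declarations can be checked mechanically. The sketch below is only an illustration: the encoding of a declaration as a triple (constructor, argument sorts, result sort), the assumption that the subsort relation is given as a transitively closed set of pairs, and the function names are all choices made here, not part of the calculus.

```python
from typing import Callable, List, Set, Tuple

Declaration = Tuple[str, Tuple[str, ...], str]   # (constructor, argument sorts, result sort)


def make_leq(subsorts: Set[Tuple[str, str]]) -> Callable[[str, str], bool]:
    """Reflexive closure of the declared (assumed transitive) subsort pairs."""
    def leq(a: str, b: str) -> bool:
        return a == b or (a, b) in subsorts
    return leq


def is_monotonic(declarations: List[Declaration],
                 subsorts: Set[Tuple[str, str]]) -> bool:
    """Check: c : A -> sigma, c : B -> tau and sigma <= tau imply A <= B pointwise."""
    leq = make_leq(subsorts)
    for (c1, args1, res1) in declarations:
        for (c2, args2, res2) in declarations:
            if c1 == c2 and leq(res1, res2):
                if len(args1) != len(args2):
                    return False
                if not all(leq(a, b) for a, b in zip(args1, args2)):
                    return False
    return True


SUBSORTS = {("even", "nat"), ("odd", "nat")}
NUMBERS: List[Declaration] = [
    ("0", (), "even"),
    ("S", ("even",), "odd"),
    ("S", ("odd",), "even"),
    ("S", ("nat",), "nat"),
]
# A hypothetical extra declaration that breaks monotonicity.
BROKEN = NUMBERS + [("S", ("nat",), "odd")]
```

With these tables, `is_monotonic(NUMBERS, SUBSORTS)` holds, whereas the hypothetical signature `BROKEN` is rejected because `S : nat -> odd` and `S : even -> odd` share the result sort while `nat` is not a subsort of `even`.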

In the present paper, we study constructor subtyping in the context of the
Calculus of Inductive Constructions (CIC) [SS], a dependently typed $\lambda$-calculus
that forms the basis of several proof-assistants, including Coq [1] and Lego [?].
In particular, we show that adding constructor subtyping preserves some fundamental
properties of CIC, including confluence, subject reduction, strong normalization
and decidability of type-checking. These results extend those of previous
papers [?] to dependent types, and open the road for an integration of constructor
subtyping in proof-development systems such as Coq and Lego.

The remainder of the paper is organized as follows: in Section 2, we present
an extension of the Calculus of Inductive Constructions with Constructor Subtyping.
As we shall explain, our presentation is slightly different from the one
of [7,8] so as to scale up to dependent types. In Section 3 we prove that the
main meta-theoretical properties of the Calculus of Inductive Constructions are
preserved. While the confluence and strong normalization proofs rely on standard
techniques, both Subject Reduction and decidability of type-checking do
pose some interesting difficulties that are pervasive in all calculi which combine
parametric inductive types and subtyping. In Section 4 we extend the type system
with unbounded recursion, leaving aside the somewhat intrinsic problem of
guarded recursive definitions. We show the calculus remains confluent and enjoys
the subject reduction property; strong normalization obviously fails and hence
so does decidability of type-checking. Finally, Section 5 concludes with related
work and directions for further research. Most proofs are only sketched.

2 Syntax 

A countably infinite set $\mathcal{V}$ of variables, written as $x, y, z, \ldots$ is assumed. We
assume further a set $\mathcal{D}$ of datatypes and a set $\mathcal{C}$ of constructors. Datatypes are
written as $\sigma, \tau, \ldots$ and constructors are written as $f, \ldots$. Every datatype $\sigma$ and
every constructor $f$ comes equipped with a fixed arity, which is a natural number
indicating the number of parameters it is supposed to have. The arity of a symbol
$s$ is denoted by $\mathrm{ar}(s)$. In addition, every datatype $\sigma$ comes equipped with a set of
constructors, denoted by $C(\sigma)$. Finally, two sorts are assumed: the sort of types,
written as $\star$, and the sort of kinds, written as $\Box$. The set $\{\star, \Box\}$ is denoted by
$\mathcal{S}$.

In the remainder of the paper, we will make use of the following two examples.
The first one is the datatype nat of natural numbers, with ar(nat) = 0 and 
C(nat) = {0,S}. We further assume ar(0) = 0 and ar(S) = 1. The second one is 
the datatype list of polymorphic lists, with ar(list) = 1 and C(list) = {nil, cons}. 
The argument of list is meant to specify the type of the elements of the list; 
for instance list nat represents the type of lists of natural numbers. We further 
assume ar(nil) = 1 and ar(cons) = 3. 

Pseudo-Terms. Pseudo-terms are built from the standard constructions for dependent
types, datatypes and constructors, and case-expressions. The latter are
annotated by their type (superscript) and by the type of the expression being
matched (subscript). The purpose of both annotations is to guide type-inference;
see Section 3. Finally, note that at this point no constructor for fixpoints is
present. The introduction of such a constructor is postponed until Section 4.

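Definition 1. The set $\mathcal{T}$ of pseudo-terms is defined inductively as follows:

1. $\mathcal{S} \cup \mathcal{V} \subseteq \mathcal{T}$;

2. if $x \in \mathcal{V}$ and $A, B \in \mathcal{T}$, then $\Pi x : A.\, B$, $\lambda x : A.\, B$, $A\, B \in \mathcal{T}$;

3. if $\sigma \in \mathcal{D}$ with $\mathrm{ar}(\sigma) = n$, and $M_1, \ldots, M_n \in \mathcal{T}$, then $\sigma\, M_1 \ldots M_n \in \mathcal{T}$;

4. if $f \in \mathcal{C}$ with $\mathrm{ar}(f) = n$, and $M_1, \ldots, M_n \in \mathcal{T}$, then $f\, M_1 \ldots M_n \in \mathcal{T}$;

5. if $\sigma \in \mathcal{D}$ with $\mathrm{ar}(\sigma) = n$, and $C(\sigma) = \{f_1, \ldots, f_k\}$, and further $M, M_1, \ldots, M_k,
P_1, \ldots, P_n, Q \in \mathcal{T}$, then $\mathsf{case}^{Q}_{\sigma \vec{P}}\ M$ of $\{f_1 \to M_1, \ldots, f_k \to M_k\} \in \mathcal{T}$.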

The notions of free and bound variable, $\alpha$-conversion and substitution are defined
as usual. We write $M[x := N]$ for the result of substituting all free occurrences
of $x$ in $M$ by $N$. Further, we write $A \to B$ for $\Pi x : A.\, B$ if $x$ does not occur free
in $B$. The set of closed pseudo-terms (i.e. without free variables) is denoted by
$\mathcal{T}_0$. Finally, $P_1 \ldots P_n$ is sometimes abbreviated as $\vec{P}$. We then use the notation
$\#\vec{P}$ to denote the length of such a vector.

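Rewriting. We consider three reduction rules on pseudo-terms, for $\beta$-reduction,
$\iota$-reduction, and $\kappa$-reduction respectively:

1. $(\lambda x : A.\, M)\, N \to_\beta M[x := N]$;

2. $\mathsf{case}^{Q}_{\sigma \vec{P}}\ (f_i\, \vec{P'}\, N_1 \ldots N_m)$ of $\{f_1 \to M_1, \ldots, f_k \to M_k\} \to_\iota M_i\, N_1 \ldots N_m$
   with $\mathrm{ar}(f_i) = \mathrm{ar}(\sigma) + m$;

3. $f\, \vec{P}\, \vec{N} \to_\kappa f\, \underbrace{\Box \ldots \Box}_{\#\vec{P}\ \text{times}}\, \vec{N}$ where $f \in C(\sigma)$ with $\mathrm{ar}(\sigma) = \#\vec{P}$ and
   $\mathrm{ar}(f) = \#\vec{P} + \#\vec{N}$.

As usual, the reduction relation $\to_R$ is defined as the smallest compatible closure
of the rule $R$. The definitions of $\beta$- and $\iota$-reduction are standard. The $\kappa$-reduction
relation is less standard; it is there to enforce subject reduction to hold, and is
only used in the type conversion rule (see below). We write $\to_{\beta\iota}$ for the reduction
relation $\to_\beta \cup \to_\iota$ and $\to_{\beta\iota\kappa}$ for the reduction relation $\to_\beta \cup \to_\iota \cup \to_\kappa$.

Note that in the $\iota$-reduction rule, $\vec{P}$ and $\vec{P'}$ are not required to be identical,
the idea being that $\vec{P'}$ may be a subtype of $\vec{P}$ (pointwise). Note further that
$\kappa$-reduction does not affect the number of arguments of $f$.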



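Subtyping. We assume given a binary subtyping relation $\sqsubseteq_d$ over $\mathcal{D}$ that is
reflexive and transitive, and that in addition satisfies the following two requirements
for every $\sigma, \tau \in \mathcal{D}$:

1. if $\sigma \sqsubseteq_d \tau$, then $C(\sigma) \subseteq C(\tau)$;

2. if $C(\sigma) \cap C(\tau) \neq \emptyset$, then $\mathrm{ar}(\sigma) = \mathrm{ar}(\tau)$.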



An example also used in the remainder of the paper is that of odd and even natural
numbers. Those are represented by datatypes odd and even with ar(odd) =
ar(even) = 0, and C(odd) = {S}, and C(even) = {0,S}. We assume given the
subtyping relation odd $\sqsubseteq_d$ nat and even $\sqsubseteq_d$ nat.

The relation $\sqsubseteq_d$ is used in the definition of the subtyping relation $\sqsubseteq$. In
the presence of dependent types, the subtyping relation needs to encompass
the convertibility relation, and is therefore undecidable. For some purposes, in
particular for the notion of strict overloading to be decidable, it is convenient to
consider in addition a restricted notion of subtyping $\sqsubseteq_s$ which does not account
for convertibility. The two subtyping relations are defined as follows.

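Definition 2.

1. The subtyping relation $\sqsubseteq$ is defined by the following rules:

$$
\begin{array}{ll}
(\text{refl}) & \dfrac{}{A \sqsubseteq A} \\[2ex]
(\text{trans}) & \dfrac{A \sqsubseteq A' \qquad A' \sqsubseteq A''}{A \sqsubseteq A''} \\[2ex]
(\text{conv}) & \dfrac{A =_{\beta\iota\kappa} B}{A \sqsubseteq B} \\[2ex]
(\text{prod}) & \dfrac{A' \sqsubseteq A \qquad B \sqsubseteq B'}{\Pi x : A.\, B \sqsubseteq \Pi x : A'.\, B'} \\[2ex]
(\text{data}) & \dfrac{\sigma \sqsubseteq_d \tau \qquad A_1 \sqsubseteq B_1 \ \cdots\ A_{\mathrm{ar}(\sigma)} \sqsubseteq B_{\mathrm{ar}(\sigma)}}{\sigma\, A_1 \ldots A_{\mathrm{ar}(\sigma)} \sqsubseteq \tau\, B_1 \ldots B_{\mathrm{ar}(\sigma)}}
\end{array}
$$

2. The conversion-free subtyping relation $\sqsubseteq_s$ is defined by all the rules above
except (conv).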

A major design decision is that subtyping is defined independently of typing by
defining it on pseudo-terms as in IT0T36I. This allows us to break the circularity
between typing and subtyping found in [?], where subtyping is only defined on
legal terms and thereby depends on typing.

The unusual rule (data) requires that inductive types are monotonic in their
parameters. It is used for instance to derive list odd $\sqsubseteq$ list nat. An alternative
is to consider a polarized calculus as e.g. in [?], but most datatypes of interest
are monotonic in their parameters, so we feel the complications are not justified
here.

Finally, note that the example of odd and even natural numbers illustrates
that it is not possible to use set-theoretic inclusion on the set of constructors to
define subtyping on datatypes; this would yield odd $\sqsubseteq$ even which is undesirable.
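
The two requirements on the datatype subtyping relation, and the remark about set-theoretic inclusion, can be replayed on the running example. The following sketch is purely illustrative; the dictionary encoding and the function names are assumptions made here.

```python
from typing import Dict, Set, Tuple

# Constructor sets and arities of the running example, as given in the text.
CONSTRUCTORS: Dict[str, Set[str]] = {"odd": {"S"}, "even": {"0", "S"}, "nat": {"0", "S"}}
ARITY: Dict[str, int] = {"odd": 0, "even": 0, "nat": 0}

# Declared pairs of the datatype subtyping relation (already transitive here).
DECLARED: Set[Tuple[str, str]] = {("odd", "nat"), ("even", "nat")}


def subtype_d(sigma: str, tau: str) -> bool:
    """Reflexive closure of the declared datatype subtyping pairs."""
    return sigma == tau or (sigma, tau) in DECLARED


def satisfies_requirements() -> bool:
    """Requirement 1: subtyping implies inclusion of constructor sets.
    Requirement 2: datatypes sharing a constructor have the same arity."""
    for s in CONSTRUCTORS:
        for t in CONSTRUCTORS:
            if subtype_d(s, t) and not CONSTRUCTORS[s] <= CONSTRUCTORS[t]:
                return False
            if CONSTRUCTORS[s] & CONSTRUCTORS[t] and ARITY[s] != ARITY[t]:
                return False
    return True


def inclusion_pairs() -> Set[Tuple[str, str]]:
    """Pairs that naive inclusion of constructor sets would relate."""
    return {(s, t) for s in CONSTRUCTORS for t in CONSTRUCTORS
            if s != t and CONSTRUCTORS[s] <= CONSTRUCTORS[t]}
```

The pair `("odd", "even")` belongs to `inclusion_pairs()` although it is not part of the declared relation, which is exactly the undesirable consequence mentioned above.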

Typing constructors. A next question is how to provide types for the datatypes
and constructors. In order to do this, we assume given two mappings $K : \mathcal{D} \to \mathcal{T}_0$
and $D : (\Sigma \sigma \in \mathcal{D}.\, C(\sigma)) \to \mathcal{T}_0$ such that for every datatype $\sigma \in \mathcal{D}$ and for every
constructor $f \in C(\sigma)$ we have the following:

1. $K(\sigma)$ is of the form $\Pi \vec{x} : \vec{A}.\, \star$ where $\#\vec{x} = \mathrm{ar}(\sigma)$,

2. $D(\sigma, f)$ is of the form $\Pi \vec{x} : \vec{A}.\, \Pi \vec{y} : \vec{B}.\, \sigma\, \vec{x}$ where $K(\sigma) = \Pi \vec{x} : \vec{A}.\, \star$ and
$\#\vec{x} + \#\vec{y} = \mathrm{ar}(f)$.

If $D(\sigma, f) = \Pi \vec{x} : \vec{A}.\, \Pi \vec{y} : \vec{B}.\, \sigma\, \vec{x}$, we write $D^\circ(\sigma, f)$ for $\Pi \vec{x} : \vec{A}.\, \Pi \vec{y} : \vec{B}.\, \Box$.

Before proceeding with an example, let us emphasize that we do not consider
inductive families since the codomain of $D(\sigma, f)$ is of the form $\sigma\, \vec{x}$. It is
straightforward to add inductive families to our calculus, but more difficult to
adapt the notion of constructor subtyping to inductive families, see Section 5 for
a discussion.

To illustrate the intended use of $K$ and $D$, we consider the datatype list of
polymorphic lists. Natural mappings $K$ and $D$ are defined by the following:

K(list) = $\star \to \star$
D(list, nil) = $\Pi x : \star.\ \mathsf{list}\ x$
D(list, cons) = $\Pi x : \star.\ x \to \mathsf{list}\ x \to \mathsf{list}\ x$

Using the typing system defined below, we have, with $n$ : nat and $l$ : list nat:

list nat : $\star$
nil nat : list nat
cons nat $n$ $l$ : list nat

Overloading. In contrast with the Calculus of Inductive Constructions, constructors
may be overloaded. This is crucial to the applicability of constructor
subtyping, as most examples require constructors to be overloaded. However, in
presence of subsumption, overloading leads to difficulties with subject reduction.
To illustrate the problem, we consider two sorts: $\bot$ and $\top$, with $\bot \sqsubseteq_d \top$.
We assume the following constructors for these sorts: the constructor of $\bot$ is
$f : \top \to \bot$, and the constructors of $\top$ are $f : \bot \to \top$ and $t : \top$. Note that
the function $f$ is overloaded. Further, some type $A$ is assumed, and two terms
$g : \bot \to A$ and $a : A$. We have (we omit sub- and superscripts in the
case-expression)

$\mathsf{case}\ (f\, t)$ of $\{f \to g,\ t \to a\} : A$

This term reduces to $g\, t$, which is not typable. So subject reduction fails.

Hence overloading must be constrained in some way. The solution advocated
in [?] is to require the following:

1. anti-monotonicity in the first argument: if $f$ is a constructor for $\sigma$ and $\tau$
with $\sigma \sqsubseteq \tau$, then the domain of $f$ w.r.t. $\tau$ must be a subtype of the domain
of $f$ w.r.t. $\sigma$;

2. if $f$ is a constructor for $\sigma$ and $\tau$, then $\sigma$ and $\tau$ must be parameterized over
the same types.

Here we follow a similar approach but, in order to enforce decidability, we rely
on $\sqsubseteq_s$ rather than on $\sqsubseteq$ to compare domains.

Definition 3 (Strict overloading). A constructor $f$ is strictly overloaded if
for every $\sigma, \tau \in \mathcal{D}$ with $f \in C(\sigma) \cap C(\tau)$ we have the following:

1. if $\sigma \sqsubseteq_d \tau$ then $D^\circ(\tau, f) \sqsubseteq_s D^\circ(\sigma, f)$,

2. $K(\sigma) = K(\tau)$.

The constructor S for successor of the datatype for odd and even and natural
numbers is strictly overloaded. Indeed, we have:

K(even) = K(odd) = K(nat) = $\star$

D(even, S) = odd $\to$ even        $D^\circ$(even, S) = odd $\to \Box$
D(odd, S)  = even $\to$ odd        $D^\circ$(odd, S)  = even $\to \Box$
D(nat, S)  = nat $\to$ nat         $D^\circ$(nat, S)  = nat $\to \Box$

So:

$D^\circ$(nat, S) $\sqsubseteq_s$ $D^\circ$(even, S)

$D^\circ$(nat, S) $\sqsubseteq_s$ $D^\circ$(odd, S)

From now on, we assume all constructors to be strictly overloaded. 
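
For datatypes without parameters, the strict overloading condition can be tested by a small conversion-free subtype check. The sketch below is an illustration only: types are encoded as strings (datatype names and the sort `"box"`) or as triples `("->", domain, codomain)`, the declared relation is assumed to be transitively closed, and all function names are hypothetical. The second table encodes the two-sort counterexample with the overloaded constructor `f`.

```python
from typing import Dict, Set, Tuple, Union

Type = Union[str, Tuple[str, "Type", "Type"]]


def arrow(dom: Type, cod: Type) -> Type:
    return ("->", dom, cod)


def sub_s(a: Type, b: Type, declared: Set[Tuple[str, str]]) -> bool:
    """Conversion-free subtyping on this fragment: (refl), (prod) and (data)."""
    if isinstance(a, tuple) and isinstance(b, tuple):
        # (prod): contravariant in the domain, covariant in the codomain.
        return sub_s(b[1], a[1], declared) and sub_s(a[2], b[2], declared)
    if isinstance(a, str) and isinstance(b, str):
        return a == b or (a, b) in declared
    return False


def strictly_overloaded(f: str, declared: Set[Tuple[str, str]],
                        d_box: Dict[Tuple[str, str], Type],
                        kinds: Dict[str, str]) -> bool:
    """Check both clauses of strict overloading for the constructor f."""
    owners = [sigma for (sigma, g) in d_box if g == f]
    for sigma in owners:
        for tau in owners:
            if kinds[sigma] != kinds[tau]:
                return False
            below = sigma == tau or (sigma, tau) in declared
            if below and not sub_s(d_box[(tau, f)], d_box[(sigma, f)], declared):
                return False
    return True


NUM_DECLARED = {("odd", "nat"), ("even", "nat")}
NUM_D_BOX = {("even", "S"): arrow("odd", "box"),
             ("odd", "S"): arrow("even", "box"),
             ("nat", "S"): arrow("nat", "box")}
NUM_KINDS = {"even": "star", "odd": "star", "nat": "star"}

# The counterexample: f : top -> bot for bot, f : bot -> top for top, bot below top.
BAD_DECLARED = {("bot", "top")}
BAD_D_BOX = {("bot", "f"): arrow("top", "box"), ("top", "f"): arrow("bot", "box")}
BAD_KINDS = {"bot": "star", "top": "star"}
```

The successor constructor passes the check, while the overloaded `f` of the counterexample is rejected because `top` is not a subtype of `bot`.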

Typing System. The typing system features the standard rules for the Calculus
of Inductive Constructions, with the exception of the conversion rule which is
replaced by the more general subsumption rule. Note that, in order for datatypes
and constructors to be fully applied, the typing relation $\vdash$ is defined via some
auxiliary relations $\vdash_n$ where $n \in \mathbb{N}$. We adopt the conventions $\vdash_0 = \vdash$ and
$0 - 1 = 0$.
Definition 4. The typing relation $\Gamma \vdash M : A$ is defined by the rules of figure 1,
where in the (case) rule it is assumed that:

1. $C(\sigma) = \{f_1, \ldots, f_k\}$;

2. $K(\sigma) = \Pi \vec{x} : \vec{A}.\, \star$;

3. $D(\sigma, f_i) = \Pi \vec{x} : \vec{A}.\, \Pi \vec{y_i} : \vec{B_i}.\, \sigma\, \vec{x}$ for all $i$ with $1 \le i \le k$;

4. $C_i = \Pi \vec{y_i} : \vec{B_i}[\vec{x} := \vec{E}].\, Q\, (f_i\, \vec{E}\, \vec{y_i})$ for all $i$ with $1 \le i \le k$;

5. Q M A[' . 

The premises in the (datatype) and (constructor) rules are meant to ensure 
that the types of datatypes and constructors are well-formed. Indeed, not every 
datatype will be legal: e.g. the datatype a with K((t) = IIx : is not legal. 



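$$
\begin{array}{ll}
(\text{axiom}) & \dfrac{}{\vdash \star : \Box} \\[2ex]
(\text{start}) & \dfrac{\Gamma \vdash A : s}{\Gamma, x : A \vdash x : A} \\[2ex]
(\text{weakening}) & \dfrac{\Gamma \vdash A : s \qquad \Gamma \vdash M : B}{\Gamma, x : A \vdash M : B} \\[2ex]
(\text{product}) & \dfrac{\Gamma \vdash A : s \qquad \Gamma, x : A \vdash B : s'}{\Gamma \vdash \Pi x : A.\, B : s'} \\[2ex]
(\text{abstraction}) & \dfrac{\Gamma, x : A \vdash M : B \qquad \Gamma \vdash \Pi x : A.\, B : s}{\Gamma \vdash \lambda x : A.\, M : \Pi x : A.\, B} \\[2ex]
(\text{application}) & \dfrac{\Gamma \vdash_n M : \Pi x : A.\, B \qquad \Gamma \vdash N : A}{\Gamma \vdash_{n-1} M\, N : B[x := N]} \\[2ex]
(\text{datatype}) & \dfrac{\vdash K(\sigma) : \Box}{\vdash_{\mathrm{ar}(\sigma)} \sigma : K(\sigma)} \\[2ex]
(\text{constructor}) & \dfrac{\vdash D(\sigma, c) : \star}{\vdash_{\mathrm{ar}(c)} c : D(\sigma, c)} \\[2ex]
(\text{case}) & \dfrac{\Gamma \vdash M : \sigma\, \vec{E} \qquad \Gamma \vdash Q : \sigma\, \vec{E} \to \star \qquad \Gamma \vdash N_i : C_i \quad (1 \le i \le k)}{\Gamma \vdash \mathsf{case}^{Q}_{\sigma \vec{E}}\ M\ \text{of}\ \{f_1 \to N_1, \ldots, f_k \to N_k\} : Q\, M} \\[2ex]
(\text{subsumption}) & \dfrac{\Gamma \vdash M : A \qquad \Gamma \vdash B : s \qquad A \sqsubseteq B}{\Gamma \vdash M : B}
\end{array}
$$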






Fig. 1. Typing rules 
To illustrate the use of case-expressions, we consider a definition of the
predecessor function on natural numbers. We have:

C(nat) = {0, S}
K(nat) = $\star$
D(nat, 0) = nat
D(nat, S) = nat $\to$ nat

Further, we use:

$\lambda x : \mathsf{nat}.\ \mathsf{nat}$ : nat $\to \star$
0 : nat
$\lambda x : \mathsf{nat}.\ x$ : nat $\to$ nat

as $Q$, $N_1$, and $N_2$ in the (case) typing rule. Note that indeed $(\lambda x : \mathsf{nat}.\ \mathsf{nat})\ 0 =_\beta \mathsf{nat}$
and $\Pi y : \mathsf{nat}.\ (\lambda x : \mathsf{nat}.\ \mathsf{nat})\ (S\, y) =_\beta (\mathsf{nat} \to \mathsf{nat})$. Then we have:

$\mathsf{case}^{Q}_{\mathsf{nat}}\ M$ of $\{0 \to 0,\ S \to \lambda x : \mathsf{nat}.\ x\}$ : nat

with $M$ : nat. The rewrite rules yield that we have

$\mathsf{case}^{Q}_{\mathsf{nat}}\ 0$ of $\{0 \to 0,\ S \to \lambda x : \mathsf{nat}.\ x\} \to_\iota 0$

$\mathsf{case}^{Q}_{\mathsf{nat}}\ (S\, n)$ of $\{0 \to 0,\ S \to \lambda x : \mathsf{nat}.\ x\} \to_\iota (\lambda x : \mathsf{nat}.\ x)\ n \to_\beta n$
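
The two reduction sequences can be replayed with a tiny untyped evaluator. This is a sketch under simplifying assumptions: type annotations and datatype parameters are dropped (nat has none), terms are encoded as tagged tuples, substitution is only applied to closed arguments so that variable capture cannot occur, and the names `subst`, `normalize`, `succ` and `pred` are hypothetical.

```python
def subst(term, name, value):
    """Replace the free occurrences of variable `name` in `term` by `value`."""
    kind = term[0]
    if kind == "var":
        return value if term[1] == name else term
    if kind == "lam":
        _, var, body = term
        return term if var == name else ("lam", var, subst(body, name, value))
    if kind == "app":
        return ("app", subst(term[1], name, value), subst(term[2], name, value))
    if kind == "con":
        return ("con", term[1], tuple(subst(a, name, value) for a in term[2]))
    if kind == "case":
        branches = {c: subst(b, name, value) for c, b in term[2].items()}
        return ("case", subst(term[1], name, value), branches)
    raise ValueError(kind)


def normalize(term):
    """Apply beta- and iota-steps until none is possible (no reduction under lambda)."""
    kind = term[0]
    if kind == "app":
        fun, arg = normalize(term[1]), normalize(term[2])
        if fun[0] == "lam":                      # beta-step
            return normalize(subst(fun[2], fun[1], arg))
        return ("app", fun, arg)
    if kind == "con":
        return ("con", term[1], tuple(normalize(a) for a in term[2]))
    if kind == "case":
        scrutinee = normalize(term[1])
        if scrutinee[0] == "con":                # iota-step: select branch, pass arguments
            result = term[2][scrutinee[1]]
            for a in scrutinee[2]:
                result = ("app", result, a)
            return normalize(result)
        return ("case", scrutinee, term[2])
    return term                                  # variables and abstractions


ZERO = ("con", "0", ())


def succ(n):
    return ("con", "S", (n,))


def pred(m):
    """case m of {0 -> 0, S -> lambda x. x}"""
    return ("case", m, {"0": ZERO, "S": ("lam", "x", ("var", "x"))})
```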



3 Metatheory 



Confluence. The first part of the following proposition follows because $\to_{\beta\iota}$ is
orthogonal.

Proposition 1.

1. $\to_{\beta\iota}$ is confluent on the set of pseudo-terms.

2. $\to_{\beta\iota\kappa}$ is confluent on the set of pseudo-terms.

Subtyping. We present an alternative definition of subtyping, denoted by $\sqsubseteq_{int}$,
that is shown to be equivalent to the original one. The subtyping relation $\sqsubseteq_{int}$
is used to prove subject reduction.

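Definition 5. The relation $\sqsubseteq_{int}$ is defined by the following rules:

$$
\begin{array}{ll}
(\text{prod}) & \dfrac{C \twoheadrightarrow_{\beta\iota\kappa} \Pi x : A.\, B \qquad C' \twoheadrightarrow_{\beta\iota\kappa} \Pi x : A'.\, B' \qquad A' \sqsubseteq_{int} A \qquad B \sqsubseteq_{int} B'}{C \sqsubseteq_{int} C'} \\[2ex]
(\text{data}) & \dfrac{C \twoheadrightarrow_{\beta\iota\kappa} \sigma\, \vec{A} \qquad C' \twoheadrightarrow_{\beta\iota\kappa} \tau\, \vec{B} \qquad \sigma \sqsubseteq_d \tau \qquad A_i \sqsubseteq_{int} B_i \quad (1 \le i \le \mathrm{ar}(\sigma))}{C \sqsubseteq_{int} C'} \\[2ex]
(\text{conv}) & \dfrac{C =_{\beta\iota\kappa} C'}{C \sqsubseteq_{int} C'}
\end{array}
$$

Here the reflexivity and transitivity rules are eliminated, and the conversion rule
is distributed over the remaining ones. Note the system is not syntax-directed
because of the (conv) rule.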

Proposition 2. $A \sqsubseteq_{int} B$ if and only if $A \sqsubseteq B$.

Proof. It follows by induction on the definition of $\sqsubseteq_{int}$ that $A \sqsubseteq_{int} B$ implies
$A \sqsubseteq B$.

Suppose that $A \sqsubseteq B$. We proceed by induction on the definition of $\sqsubseteq$. The
problematic case is if $A \sqsubseteq B$ is the conclusion of the rule (trans). First it can be
shown by induction on the derivation of $A \sqsubseteq_{int} B$ that if $A =_{\beta\iota\kappa} A'$, $B =_{\beta\iota\kappa} B'$
and $A \sqsubseteq_{int} B$, then $A' \sqsubseteq_{int} B'$ and moreover the derivations of both judgments
have the same length. Then it can be shown that if $A \sqsubseteq_{int} B$ and $B \sqsubseteq_{int} C$, then
$A \sqsubseteq_{int} C$ by induction on the derivation of $A \sqsubseteq_{int} B$.

Subject Reduction. The intermediate presentation of subtyping is used to prove
the following lemma, which is crucial to prove subject reduction.

Lemma 1. If $\Pi x : A.\, B \sqsubseteq \Pi x : A'.\, B'$ then $A' \sqsubseteq A$ and $B \sqsubseteq B'$.

Using this lemma, subject reduction can be proved by adapting the standard
proof for pure type systems (see for example [15]) to the case of pure type
systems with subtyping, as also done in [?].

Note however that the use of $\kappa$-reduction in the (conv) rule is crucial. Indeed,
consider the term $M$ = cons even 0 (nil even) which has type list even. We have
(using notation as in the definition of the typing rules) $N_1 : C_1$ and $N_2 : C_2$
with $N_2 = \lambda n : \mathsf{nat}.\ \lambda l : \mathsf{list\ nat}.\ N_2'$, $C_1 = Q$ (nil nat),
$C_2 = \Pi h : \mathsf{nat}.\ \Pi t : \mathsf{list\ nat}.\ Q$ (cons nat $h$ $t$), for some suitably typed $N_1$ and $N_2'$ and some
$Q$ : list nat $\to \star$. Now we have:

case $M$ of {nil $\to N_1$, cons $\to N_2$} : $Q$ (cons even 0 (nil even))

This term reduces to:

$N_2$ 0 (nil even) : $Q$ (cons nat 0 (nil even))

We have the conversion $Q$ (cons even 0 (nil even)) $=_\kappa$ $Q$ (cons nat 0 (nil even)) and
hence by the conversion rule we have

$N_2$ 0 (nil even) : $Q$ (cons even 0 (nil even)).



Proposition 3 (Subject Reduction). If $\Gamma \vdash M : A$ and $M \to_{\beta\iota} M'$ then
$\Gamma \vdash M' : A$.

Proof. The proof proceeds as in [?] by induction on the structure of derivations,
proving simultaneously the following two implications: if $\Gamma \vdash M : A$ and $M \to_{\beta\iota} M'$,
then $\Gamma \vdash M' : A$, and if $\Gamma \vdash M : A$ and $\Gamma \to_{\beta\iota} \Gamma'$ then $\Gamma' \vdash M : A$.



26 



Gilles Barthe and Femke van Raamsdonk 



Here we only consider the case where the last typing rule is the (case) rule.
Let $M = \mathsf{case}^{Q}_{\sigma E}\ M'$ of $\{f_1 \to N_1, \ldots, f_k \to N_k\} : Q\, M'$ with $M' = f_i\, E'\, P$
and $M \to_\iota M''$. For simplicity we restrict our attention to the case where the
constructor $f_i$ has just one argument, and the type $\sigma$ has just one parameter.
Let the last rule of the derivation be the (case) rule, as follows:

$$
\dfrac{\Gamma \vdash M' : \sigma\, E \qquad \Gamma \vdash Q : \sigma\, E \to \star \qquad \Gamma \vdash N_i : C_i \quad (1 \le i \le k)}{\Gamma \vdash \mathsf{case}^{Q}_{\sigma E}\ M'\ \text{of}\ \{f_1 \to N_1, \ldots, f_k \to N_k\} : Q\, M'}
$$

We have $M \to_\iota N_i\, P$. By generation, we have: $E \sqsubseteq E'$, $\Gamma \vdash N_i : \Pi y : B[x := E].\, Q\, (f_i\, E\, y)$,
and $\Gamma \vdash E' : A$.

Moreover, there exists $\tau \sqsubseteq_d \sigma$ such that $D(\tau, f_i) = \Pi x : A.\, \Pi y : B'.\, \tau\, x$, and
$\Gamma \vdash P : B'[x := E']$. By strict overloading we have $B' \sqsubseteq B$. Because parameters
occur positively in constructor declarations, we have $B'[x := E'] \sqsubseteq B[x := E]$.
Subsumption yields that $\Gamma \vdash P : B[x := E]$, and the rule for application that
$\Gamma \vdash N_i\, P : Q\, (f_i\, E\, P)$. Finally, by convertibility we have $\Gamma \vdash N_i\, P : Q\, (f_i\, E'\, P)$.

Termination. Thus far, we have not imposed any restriction on $D$ and as a
consequence the calculus is not terminating; in fact, it is possible to encode
Girard's system U into our calculus, see [?, page 113].

In order to ensure termination, we must impose some conditions on $D$:
constructors must be monotonic w.r.t. parameters and datatypes. In order to handle
mutual recursion, we introduce a precedence relation ◄ on $\mathcal{D}$, which is supposed
to be a pre-order. Below we let ► be defined by $\tau$ ► $\sigma$ iff $\sigma$ ◄ $\tau$, ◄► be defined
as ◄ $\cap$ ► and ◄◄► be defined as ◄ $\setminus$ ►. Moreover we require:

1. ◄◄► is well-founded;

2. the precedence relation is respected, i.e. if $\tau$ occurs in $D(\sigma, c)$ then $\tau$ ◄ $\sigma$;

3. parameters must occur positively in the body of the declarations, i.e. if
$D(\sigma, c)$ is of the form $\Pi \vec{x} : \vec{A}.\ \Pi \vec{y} : \vec{B}.\ \sigma\ \vec{x}$, then every $x \in \vec{x}$ occurs
positively in $\Pi \vec{y} : \vec{B}.\ \sigma\ \vec{x}$ (the precise definition of positivity may be found e.g.
in [?]);

4. datatypes must occur positively in the body of their declarations, i.e. if
$D(\sigma, c)$ is of the form $\Pi \vec{x} : \vec{A}.\ \Pi \vec{y} : \vec{B}.\ \sigma\ \vec{x}$, then for every $\tau$ ◄► $\sigma$ every
instance of $\tau$ occurs positively in $\Pi \vec{y} : \vec{B}.\ \sigma\ \vec{x}$.

Under these hypotheses, it is possible to show termination of our calculus by the
well-known technique due to Tait and Girard, see e.g. [?] for an application
to the Calculus of Constructions.

Theorem 1 (Termination). If $\Gamma \vdash M : A$ then $M$ is $\beta\iota$-terminating.

It is then easy to conclude that legal terms are $\beta\iota\kappa$-normalizing and hence that
convertibility between legal terms is decidable.

Decidability. As usual, the type-checking algorithm is decomposed into: 

1. a type-inference algorithm which computes, if it exists, the minimal type of 
a term in a given context; 

2. a subtype-checking algorithm (SCA) based on $\sqsubseteq_{int}$.

There are two problems with the existence of minimal types. We briefly explain
what they are and how to solve them.

1. The first problem has to do with the existence of least upper bounds and
is independent of constructor subtyping. Consider an if . . . then . . . else
statement: if $b$ is a boolean, $a : A$ and $a' : A'$ then for every $B$ such that $A, A' \sqsubseteq B$,
one has $\mathsf{if}\ b\ \mathsf{then}\ a\ \mathsf{else}\ a' : B$. In order for the above expression (which is
of course a case-expression in disguise) to have a minimal type, $A$ and $A'$
must have a least upper bound. This need not be the case in general, since
we do not require the subtyping order to be an upper semi-lattice. To solve
this problem, we require case-expressions to be tagged by their type.

2. The second problem with minimal types is caused by constructor overloading.
Suppose ♠, ♣ $\in \mathcal{D}$ with arity 0 and $f \in C(♠), C(♣)$ with $D(♠, f) = \mathsf{nat} \to ♠$
and $D(♣, f) = \mathsf{nat} \to ♣$. Then we can derive $x : \mathsf{nat} \vdash f\ x : ♠$ and
$x : \mathsf{nat} \vdash f\ x : ♣$. If ♠ and ♣ are unrelated by subtyping, then $f\ x$ does not
have a minimal type. To solve this problem, we require constructors to be
regular — a notion derived from order-sorted algebra, see e.g. [?].
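Both problems come down to a set of candidate types having no least element. The following is a minimal sketch over a finite, hand-written subtype order; the helper names `leq`, `minimum` and `least_upper_bound`, and the encoding of the order as a set of pairs, are assumptions made for illustration and are not part of the calculus.

```python
# Toy illustration only: a finite subtype order given as (sub, super) pairs.

def leq(order, a, b):
    """True if a is below b in the reflexive-transitive closure of order."""
    seen, stack = set(), [a]
    while stack:
        x = stack.pop()
        if x == b:
            return True
        if x in seen:
            continue
        seen.add(x)
        stack.extend(sup for (sub, sup) in order if sub == x)
    return False


def minimum(order, candidates):
    """Return the least element of candidates, or None if there is none."""
    for c in candidates:
        if all(leq(order, c, d) for d in candidates):
            return c
    return None


def least_upper_bound(order, universe, a, b):
    """Least common supertype of a and b within universe, or None."""
    uppers = [t for t in universe if leq(order, a, t) and leq(order, b, t)]
    return minimum(order, uppers)


# Problem 1: A and A2 have two incomparable upper bounds, hence no lub.
diamond = {("A", "B1"), ("A", "B2"), ("A2", "B1"), ("A2", "B2")}
no_lub = least_upper_bound(diamond, ["A", "A2", "B1", "B2"], "A", "A2")

# Problem 2: f x can be typed by two unrelated datatypes, so no minimal type.
no_min = minimum(set(), ["spade", "club"])
```

If the two datatypes are related, for instance `minimum({("spade", "club")}, ["spade", "club"])`, the least candidate exists again, which is the situation regularity is meant to guarantee.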

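First we introduce a notation. Let $f \in C(\sigma)$ for some datatype $\sigma$, and suppose
that $D(\sigma, f) = \Pi \vec{x} : \vec{A}.\ \Pi \vec{y} : \vec{B}.\ \sigma\ \vec{x}$ with $|\vec{x}| = ar(\sigma)$. Recall that $D^{\square}(\sigma, f)$
denotes $\Pi \vec{x} : \vec{A}.\ \Pi \vec{y} : \vec{B}.\ \square$. We will use $D^{\square}_{[\vec{x} := \vec{E}]}(\sigma, f)$ to denote $\Pi \vec{y} : \vec{B}[\vec{x} := \vec{E}].\ \square$.

Regularity is then defined as follows.

Definition 6. A constructor $f$ is said to be regular if for every datatype $\sigma$, and
for all terms $F, \vec{E}$ such that $F \sqsubseteq D^{\square}_{[\vec{x} := \vec{E}]}(\sigma, f)$, we have that the set

\[ \{\tau \in \mathcal{D} \mid F \sqsubseteq D^{\square}_{[\vec{x} := \vec{E}]}(\tau, f)\} \]

contains a minimum element.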

Under the assumption of regularity, minimal types exist. 

Proposition 4. If all constructors are regular, then the calculus has minimal
types, i.e. if $\Gamma \vdash M : A$, then there exists $A' \in \mathcal{T}$ such that $\Gamma \vdash M : A'$ and
$A' \sqsubseteq A''$ for every $A'' \in \mathcal{T}$ such that $\Gamma \vdash M : A''$.

Proof. By induction on the structure of terms. We only consider the case of a
constructor term with one parameter and one argument, so suppose $M = f\ E\ P$
and $\Gamma \vdash M : \sigma\ E$. By the induction hypothesis, the term $P$ has a minimal type,
say $C$, with $\Gamma \vdash P : C$. By generation, we have $\Pi y : C.\ \square \sqsubseteq D^{\square}_{[x := E]}(\sigma, f)$. By regularity
there exists a minimal $\rho$ such that $\Pi y : C.\ \square \sqsubseteq D^{\square}_{[x := E]}(\rho, f)$. It is easy to verify that
the minimal type of $M$ is $\rho\ E$.

Note that the notion of regularity is based on $\sqsubseteq$ and is therefore undecidable.
However, we cannot rely on $\sqsubseteq_s$ instead since conversion may lead to new conflicts.
Consider for instance a slight modification of the example above, where $D(♣, f)$
is now defined by the equation $D(♣, f) = ((\lambda \alpha : *.\ \alpha)\ \mathsf{nat}) \to ♣$. As before, $f\ x$
does not have a minimal type.
Also note that we need to consider instances of constructors, as instantiating
constructors may lead to new conflicts. Consider for example ♠, ♣ $\in \mathcal{D}$ with arity
1 and $f \in C(♠), C(♣)$ with $D(♠, f) = \Pi \alpha : *.\ \mathsf{nat} \to ♠\ \alpha$ and $D(♣, f) = \Pi \alpha : *.\ \alpha \to ♣\ \alpha$.
Then we can derive $x : \mathsf{nat} \vdash f\ \mathsf{nat}\ x : ♠\ \mathsf{nat}$ and $x : \mathsf{nat} \vdash f\ \mathsf{nat}\ x : ♣\ \mathsf{nat}$.
If ♠ and ♣ are unrelated by subtyping, then $f\ \mathsf{nat}\ x$ does not have a minimal type.

The notion of regularity being undecidable, it is of some interest to provide 
some decidable sufficient condition for constructors to be regular. We present 
such a criterion. It is not the most general one possible, but it is relatively 
simple. 

The idea is to distinguish for each constructor a set of inductive positions
and require overloaded declarations only to vary in these inductive positions. So
for each constructor $f \in C$, we assume given a set $ip(f) \subseteq \{1, \ldots, ar(f)\}$ and
only allow constructor declarations to vary on these positions.

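Definition 7.

1. A constructor $f$ is safe if for every $\sigma, \tau \in \mathcal{D}$ such that $f \in C(\sigma) \cap C(\tau)$ with
$ar(\sigma) = ar(\tau) = n$ and

\[
\begin{aligned}
D(\sigma, f) &= \Pi \vec{x} : \vec{A}.\ \Pi \vec{y} : \vec{B}.\ \sigma\ \vec{x} \\
D(\tau, f) &= \Pi \vec{x} : \vec{A}.\ \Pi \vec{y} : \vec{B'}.\ \tau\ \vec{x}
\end{aligned}
\]

one has the following:

(a) for every $i \in ip(f)$, $B_i$ is of the form $\sigma_i\ \vec{x}$;

(b) for every $i \notin ip(f)$, $B_i = B'_i$.

2. Let $i_1, \ldots, i_k$ be an increasing enumeration of $ip(f)$. We let $f_\sigma$ denote
$\Pi y_{i_1} : B_{i_1}.\ \ldots\ \Pi y_{i_k} : B_{i_k}.\ \square$, and for $\rho_1, \ldots, \rho_k \in \mathcal{D}$ we let $f\{\rho_1, \ldots, \rho_k\}$ denote
$\Pi y_{i_1} : \rho_1\ \vec{x}.\ \ldots\ \Pi y_{i_k} : \rho_k\ \vec{x}.\ \square$.

The following proposition gives a sufficient condition for regularity.

Proposition 5. If $f$ is a safe constructor and for every $\sigma \in \mathcal{D}$ such that
$f \in C(\sigma)$ and $\vec{\rho} \in \mathcal{D}$ such that $f\{\vec{\rho}\} \sqsubseteq_s f_\sigma$ the set

\[ \{\tau \mid f\{\vec{\rho}\} \sqsubseteq_s f_\tau\} \]

has a minimal element, then $f$ is regular.

Proof. Assume $D(\sigma, c) = \Pi \vec{x} : \vec{A}.\ \Pi \vec{y} : \vec{B}.\ \sigma\ \vec{x}$ and let $F, \vec{E} \in \mathcal{T}$ such that
$F \sqsubseteq D^{\square}_{[\vec{x} := \vec{E}]}(\sigma, c)$. Without loss of generality, one can assume $F = \Pi \vec{y} : \vec{C}.\ \square$ with
$C_i = B_i[\vec{x} := \vec{E}]$ if $i \notin ip(c)$ and $C_i = \rho_i\ \vec{E}$ otherwise. Let
$c\{F\} = \Pi y_{i_1} : \rho_1\ \vec{x}.\ \ldots\ \Pi y_{i_k} : \rho_k\ \vec{x}.\ \square$. It is then easy to check that for every $\tau \in \mathcal{D}$
such that $c \in C(\tau)$,

\[ c\{F\} \sqsubseteq_s c_\tau \iff F \sqsubseteq D^{\square}_{[\vec{x} := \vec{E}]}(\tau, c). \]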

We now turn to the SCA algorithm. It is obtained by specifying a reduction
strategy for convertibility and by eliminating redundancies caused by the (conv)
rule. Here we use the fact that legal types are either syntactically equal to a sort
or weak-head reduce to a product $\Pi x : A.\ B$, a base term $x\ \vec{P}$ or a datatype $\sigma\ \vec{A}$.
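Definition 8.

1. Weak-head reduction $\to_{wh}$ is the smallest relation such that for every $x \in V$
and $A, P, Q, \vec{R} \in \mathcal{T}$ we have:

\[ (\lambda x : A.\ P)\ Q\ \vec{R} \;\to_{wh}\; P[x := Q]\ \vec{R} \]

(Weak-head reduction differs from $\beta$-reduction by applying only at the top-level.)
The reflexive-transitive closure of $\to_{wh}$ is denoted by $\twoheadrightarrow_{wh}$.

2. The SCA is given by the following rules:

\[
\text{(prod)}\quad
\frac{C \twoheadrightarrow_{wh} \Pi x : A.\ B \qquad C' \twoheadrightarrow_{wh} \Pi x : A'.\ B' \qquad A' \sqsubseteq_{alg} A \qquad B \sqsubseteq_{alg} B'}
     {C \sqsubseteq_{alg} C'}
\]

\[
\text{(data)}\quad
\frac{C \twoheadrightarrow_{wh} \sigma\ \vec{A} \qquad C' \twoheadrightarrow_{wh} \tau\ \vec{B} \qquad \sigma \sqsubseteq_d \tau \qquad A_i \sqsubseteq_{alg} B_i \quad (1 \le i \le ar(\sigma))}
     {C \sqsubseteq_{alg} C'}
\]

\[
\text{(var)}\quad
\frac{C \twoheadrightarrow_{wh} x\ \vec{A} \qquad C' \twoheadrightarrow_{wh} x\ \vec{B} \qquad A_i =_{\beta\iota\kappa} B_i \quad (1 \le i \le n)}
     {C \sqsubseteq_{alg} C'}
\]

\[
\text{(sort)}\quad
\frac{}{* \sqsubseteq_{alg} *}
\]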

In order to complete the description of our algorithm, one needs to specify how
to test convertibility between expressions. This may be done in exactly the same
way as in [13], although one has to take care not to compare the types of
arguments in constructors (so as to handle $\kappa$-conversion). The SCA algorithm is
sound and complete w.r.t. $\sqsubseteq$ on legal types.
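The shape of the subtype-checking rules can be sketched on a toy type syntax. This is only an illustration under strong simplifying assumptions: types are taken to be already in weak-head normal form, convertibility of base terms is approximated by syntactic equality, bound variables are not renamed, domains of products are compared contravariantly, and `data_order` is a hypothetical, transitively closed set of pairs standing for the order on datatypes. None of the class or function names below come from the paper.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Sort:
    """The sort *."""


@dataclass(frozen=True)
class Prod:
    """A product Pi var : dom. cod."""
    var: str
    dom: object
    cod: object


@dataclass(frozen=True)
class Data:
    """A datatype applied to its parameters."""
    name: str
    args: Tuple[object, ...] = ()


@dataclass(frozen=True)
class Base:
    """A base term: a variable applied to arguments."""
    head: str
    args: Tuple[object, ...] = ()


def sub_alg(a, b, data_order):
    """Check a <= b following the shape of (sort), (prod), (data), (var)."""
    if isinstance(a, Sort) and isinstance(b, Sort):
        return True
    if isinstance(a, Prod) and isinstance(b, Prod):
        # Assumption: contravariant in the domain, covariant in the codomain.
        return (sub_alg(b.dom, a.dom, data_order)
                and sub_alg(a.cod, b.cod, data_order))
    if isinstance(a, Data) and isinstance(b, Data):
        related = a.name == b.name or (a.name, b.name) in data_order
        return (related and len(a.args) == len(b.args)
                and all(sub_alg(x, y, data_order)
                        for x, y in zip(a.args, b.args)))
    if isinstance(a, Base) and isinstance(b, Base):
        # Stand-in for the convertibility test on arguments.
        return a == b
    return False


ORDER = {("even", "nat"), ("odd", "nat")}
EVEN, NAT = Data("even"), Data("nat")
LIST_EVEN, LIST_NAT = Data("list", (EVEN,)), Data("list", (NAT,))
```

With this encoding, `sub_alg(LIST_EVEN, LIST_NAT, ORDER)` holds while the converse does not, matching the lists of even numbers versus lists of naturals example used earlier.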

Proposition 6. Assume $\Gamma \vdash A : s$ and $\Gamma' \vdash B : s'$. Then $A \sqsubseteq_{alg} B$ iff $A \sqsubseteq B$.

Proof. We use the fact that $A \sqsubseteq_{int} B$ iff $A \sqsubseteq B$. Soundness is trivial.
Completeness is proved by induction over the derivation of $A \sqsubseteq_{int} B$.



Now decidability of type-checking follows from the existence of minimal types 
and the decidability of subtyping on legal terms. Hence we have the following 
result. 



Theorem 2 (Decidability of type-checking). If all constructors are regular,
then $\Gamma \vdash M : A$ is decidable.



4 Fixpoints 

In this section we indicate how the somewhat limited computational power of 
the calculus can be increased by adding mutually dependent fixpoints. This is 
done as follows: 
1. The set $\mathcal{T}$ of pseudo-terms is extended with the clause: if $x_1, \ldots, x_n \in V$
are distinct variables and $\tau_1, \ldots, \tau_n, a_1, \ldots, a_n \in \mathcal{T}$, then
$\mathsf{letrec}_i(x_1 : \tau_1 = a_1, \ldots, x_n : \tau_n = a_n) \in \mathcal{T}$.

2. The typing relation is extended with the rule

\[
\frac{\Gamma, f_1 : \tau_1, \ldots, f_n : \tau_n \vdash a_i : \tau_i \quad (1 \le i \le n)}
     {\Gamma \vdash \mathsf{letrec}_j(f_1 : \tau_1 = a_1, \ldots, f_n : \tau_n = a_n) : \tau_j}
\]

with $1 \le j \le n$.

3. Fixpoint reduction is defined as the compatible closure of the rule:

\[
\mathsf{letrec}_i(\vec{x} : \vec{\tau} = \vec{a}) \;\to\;
a_i[x_1 := \mathsf{letrec}_1(\vec{x} : \vec{\tau} = \vec{a}), \ldots, x_n := \mathsf{letrec}_n(\vec{x} : \vec{\tau} = \vec{a})]
\]

While fixpoints are crucial for the expressivity of the calculus, their introduction 
leads to non-termination and undecidable type-checking. However, confluence, 
subject reduction and minimal types are preserved, as stated in the following 
proposition. The proof is omitted. 

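Proposition 7.

1. The extended reduction relation is confluent;

2. If $\Gamma \vdash M : A$ and $M \to N$ then $\Gamma \vdash N : A$;

3. If all constructors are regular and $\Gamma \vdash M : A$ then there exists $A' \in \mathcal{T}$ such
that $\Gamma \vdash M : A'$ and $A' \sqsubseteq A''$ for every $A'' \in \mathcal{T}$ such that $\Gamma \vdash M : A''$.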

5 Concluding Remarks 

We have defined constructor subtyping for the Calculus of Inductive Constructions,
a powerful type system that forms the basis of proof-development systems
such as Coq and Lego, and shown the resulting calculus to be well-behaved. A
side-effect of our work is to provide a general approach to enforce subject reduction
in calculi which combine parametric inductive types, dependent types and
subtyping [?].

Related work. Subtyping in dependent type systems is an active research area, 
with some main trends to be distinguished: 

— name inequivalence based subtyping assumes a subtyping relation on ground
types; this relation is then extended to all types. Nordlander [?] has
been developing such a theory of subtyping for Haskell; his approach
captures those instances of constructor subtyping which do not involve
overloading, such as the datatype of lists/non-empty lists, but fails to
capture those instances of constructor subtyping involving overloading, such as
even/odd/natural numbers.

On a more theoretical level, Poll has been investigating subtyping between
(co-)inductive types [30,31]. His approach, which is framed in a categorical
setting, captures both constructor subtyping and its dual, destructor
subtyping, whose prime example is record subtyping. However, Poll does not
focus on the syntactic aspects of this form of subtyping;
— declarative subtyping allows one to declare $X \sqsubseteq A : *$ in contexts and was
originally used in conjunction with related ideas, most notably bounded
quantification [?], in order to provide a type-theoretical semantics of object-oriented
languages, see e.g. [?]. However, declarative subtyping may also be used to
represent formal languages in logical frameworks, see [?] for motivations,
examples and a dependent type system based on refinement types. The
interaction between dependent types and declarative subtyping has been studied
by Aspinall and Compagnoni [?] for the logical frameworks, by Chen [?]
for the Calculus of Constructions and by Zwanenburg [36] for Pure Type
Systems. One major difference between [?] and [?] is that the former
lets subtyping depend on typing, which leads to substantial complications
in the theoretical study of the system. In order to avoid those, we have
followed [?] and defined subtyping independently of typing. More recently,
Castagna and Chen [?] have extended Chen's variant of Aspinall and
Compagnoni's $\lambda P_{\le}$ with late-binding. Their calculus is a significant improvement
over $\lambda P_{\le}$ and allows one to formalize the examples of [?]. However, declarative
subtyping, even combined with late-binding, is not appropriate for the
inductive approach to formalization;

— implicit coercions allow one to view a term $a$ of type $A$ as a term of type $B$
whenever there is a previously agreed upon function, called a coercion, from $A$
to $B$. This approach, which leads to extremely powerful type systems, is
implemented in several proof-development systems, including Coq and Lego,
and has proved useful in several efforts to formalize mathematics in type
theory. However, implicit coercions also yield intricate coherence problems:
one would like to make sure that every two coercions from $A$ to $B$ are
extensionally equal, a property which is undecidable in the presence of parametric
coercions. Moreover, implicit coercions do not capture constructor subtyping.

Further work. Much work remains to be done. We indicate some topics that
deserve further investigation:

— inductive families: it is straightforward to extend our calculus with inductive
families. It is however more difficult to define constructor subtyping for
inductive families. Such a form of subtyping is useful for formalizing type
systems with subtyping. For example, consider a type system with a set of
types object-types and a subtyping relation $\preceq$ on object-types; one would like
to be able to define inductive families of the form $el : \mathsf{Object\text{-}Types} \to *$
such that $el\ a \sqsubseteq el\ a'$ whenever $a \preceq a'$, where Object-Types is the datatype
describing object-types. A possible approach to integrate constructor subtyping
into inductive families is to replace the (data) rule by

\[
\frac{\sigma \sqsubseteq_d \tau \qquad (A_1, \ldots, A_{ar(\sigma)}) \sqsubseteq_\sigma (B_1, \ldots, B_{ar(\sigma)})}
     {\sigma\ A_1 \ldots A_{ar(\sigma)} \sqsubseteq \tau\ B_1 \ldots B_{ar(\sigma)}}
\]

where $\sqsubseteq_\sigma$ is a relation on tuples of pseudo-terms defined for each datatype
$\sigma$. This approach is currently under investigation;
— guarded recursive definitions: while fixpoints are crucial for the expressivity
of the calculus, their introduction leads to non-termination and undecidable
type-checking. In order to recover both properties, one needs to restrict the
typing rule for fixpoints, for example by using a suitable notion of guard, see
e.g. [?], or a suitable type system based on subtyping, see e.g. [3,18]. We
feel the latter approach provides an appealing alternative for our purpose
but the technical details remain to be unveiled;

— canonical inhabitants: as pointed out in [2], the system is not well-behaved
with respect to canonical inhabitants: e.g. both nil even and nil nat are closed
normal inhabitants of list nat. This example illustrates how the equational
theory is too weak. In [?], we show that an $\eta$-expansion rule for datatypes
solves the problem in the simply typed case. It should be possible to adopt
the same solution for the Calculus of Constructions, although the combination
of $\eta$-expansion with dependent types is somewhat intricate [?].

Addressing these issues should bring us closer to our overall objective, namely 
to integrate constructor subtyping as a primitive in proof-development systems. 
Abstract. We address the problem of deciding performance equivalence
for a timed process algebra in which actions are urgent and durational, 
and where parallel components have independent local clocks. 

This process algebra can be seen as a timed extension of BPP, a process 
algebra giving rise to infinite-state processes. While bisimulation was 
known to be decidable for BPP with a non-elementary complexity, our
main and surprising result is that, for the timed extension, performance 
equivalence is decidable in polynomial time. 



1 Introduction 

Performance of processes. In the field of concurrency semantics, there exists a
well-developed and widely accepted approach based on equivalences that relate
processes having the same behaviour [Mil89, Gla90]. This framework has been
extended in many directions in order to take various aspects into consideration:
timing, causality, probability, locality, etc.

In the timed framework, some efforts have been directed toward defining a
robust notion of "performance" that would allow comparing the efficiency of
systems that have the same functional behaviour (what they do) but different
speeds (how fast they do it). See, e.g., [MT91, AH92, FM95, GRS95, CGR97,
Cor98].

Durational urgent actions. The efficiency preorders and equivalences considered
in [GRS95, AM96, CGR97, Cor98] apply to process algebras where parallel
components have their own independent local clocks, where actions have a duration
and are urgent. Urgent actions take place as soon as possible and can only be
delayed when one process must wait until synchronization with another process
becomes possible. When the process algebra does not allow synchronization, this
gives rise to a nice theory where performance equivalence is a congruence for all
process constructors [CGR97].

Verification. These earlier works mainly focused on semantics. However, verifi- 
cation issues have been addressed in this framework: 

J. Tiuryn (Ed.): FOSSACS 2000, LNCS 1784, pp. 35-E71 2000. 

(c) Springer- Verlag Berlin Heidelberg 2000 
(1) In [CP96], lazy performance equivalence is shown decidable over a class of
systems having only finite control but allowing for an infinite number of
configurations when taking into account the values of the local clocks.

(2) In [CC97], a model checking problem for TAL (a modal logic with time) is
shown decidable over another class of systems with finite control.

In both cases, the decision method relies on building a finite approximation of
the system, on which the original problem can be solved with standard
finite-state methods. This induces algorithms with exponential running time since the
finite approximation has exponential size.$^1$ These results can probably be
implemented with only polynomial-space requirements, but the issue is not addressed
and no lower bounds for the structural complexity of the problems are given.

Removing the finite control assumption. To the best of our knowledge, when
systems have a potentially infinite number of control states (disregarding clock
values), nothing is known about verification issues for these process algebras
with urgent actions and local clocks.$^2$ This is probably because the problem
combines two difficulties as it lies at the intersection of two recent fields:
verification of timed systems and verification of infinite untimed systems.

Our contribution. In this paper we investigate the decidability of the performance
equivalence introduced in [CGR97] when no finite-state restriction is made.
Because no synchronization is considered in this framework, the resulting systems
have a "BPP + Time" flavor$^3$, in a setting with local clocks. Hence our use of
"TBPP" to denote this algebra.

Decidability of bisimulation for (untimed) BPP is known, via an elegant
algorithm (alas with non-elementary complexity) [CHM93]. The connection with
BPP is what motivated our study: we wanted to see whether local clocks could
be dealt with.

Our main result is that performance equivalence is decidable for TBPP, and
can be decided in polynomial time (it is in fact PTIME-complete). Surprisingly,
the addition of local clocks does not make the problem harder: they allow
decomposing systems in a way not unlike what happens for normed processes [HJM96].

This is good news since algorithms for the analysis of well-behaved
infinite-state systems have important applications, ranging from static analysis [EK99] to
modeling and verification of communication protocols [CFP95]. This also justifies
our view that negative results about basic process algebra are not always the
last word, and that the field still contains many unexplored paths.

$^1$ Since the approximation is based on the idea that exact clock values (or differences
between them) can be forgotten when they are large enough, this has similarities with
the region graph technique of [ACD93].

$^2$ In the better known global clock framework, we are aware of [AJ98] where the
systems may have infinitely many distinct states. In addition, there exists a large body
of literature on Timed Petri Nets, but most of these works do not offer decidability
results for unbounded nets.

$^3$ BPP is the algebra of Basic Parallel Processes [Chr93].
Plan of the paper. Section 2 introduces our notation for TBPP and the
operational semantics, while Section 3 introduces the performance equivalences we
consider. The main technical part starts with the introduction of the syntactic
congruence (Section 4) and the cancellation lemmas (Section 5) that allow us to
prove the main result in Section [?].

Our presentation of TBPP is mostly oriented towards the proof of the main
result: we refer to [CGR97] for motivations, examples, and further discussion of
this process algebra.



2 Timed Basic Parallel Processes 

In this section, we define the timed process algebra TBPP as a timed extension
of BPP. This definition is based on the features proposed in [GRS95, AM96,
CGR97]:

— The time domain is the set $\mathbb{N}$ of natural numbers.

— We consider urgent and durational actions: a duration function associates
its duration (number of time units taken for execution) with each action.
This mapping is external to the syntax.

— Parallel components have independent clocks and executions are asynchronous
and ill-timed but well-caused.

[AM96] showed the technical advantages of the "ill-timed but well-caused"
viewpoint (which admits an intuitive understanding in terms of external observation).
In this framework, time is not used to enforce a synchronous view of the system.

We mainly deviate from [GRS95, CGR97] by two technical points that do
not bring any real semantical change:

— The date $n$ in a step $u \xrightarrow{a,n} v$ denotes the beginning time for $a$, not the
completing time.

— Instead of defining processes through recursive equations (as is traditional
in process algebra), we adopt Moller's approach where behaviour is defined
via a set of rewrite rules [Mol96]. This is for technical convenience only.



2.1 Syntax 

We consider a set of action names $Act$ ranged over by $a, b, \ldots$ and a set of
process variables $\mathcal{X}$ ranged over by $X, Y, \ldots$.

Definition 2.1. The set $\mathcal{T}$ of TBPP-terms is given by the following abstract
syntax:

\[ t, u ::= \mathit{Nil} \mid X \mid t \parallel u \mid 1 \triangleright t. \]
As usual, $\mathit{Nil}$ denotes the empty process which cannot proceed with any action,
and $t \parallel u$ is the parallel combination of $t$ and $u$ (no synchronization is possible).
$1 \triangleright t$ denotes the process which behaves like $t$, but with a one time unit delay.

We write $n \triangleright t$ for $\underbrace{1 \triangleright (1 \triangleright (\cdots (1 \triangleright}_{n} t) \cdots ))$ and $X^n$ for $\underbrace{X \parallel X \parallel \cdots \parallel X}_{n}$. By
convention, $0 \triangleright t$ stands for $t$ and $X^0$ stands for $\mathit{Nil}$.

For a term $t$, we denote by $Var(t)$ the set of process variables occurring in
$t$, e.g., $Var(X \parallel 1 \triangleright (X \parallel Y)) = \{X, Y\}$.
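The abstract syntax and the abbreviations above can be sketched as a small datatype. The class names and the helper names `delay`, `power` and `variables` are assumptions made for this illustration; only the grammar, the conventions for $n \triangleright t$ and $X^n$, and the definition of $Var$ come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nil:
    """The empty process."""


@dataclass(frozen=True)
class Var:
    """A process variable X."""
    name: str


@dataclass(frozen=True)
class Par:
    """Parallel composition t || u."""
    left: object
    right: object


@dataclass(frozen=True)
class Delay:
    """One time unit of delay: 1 |> t."""
    body: object


def delay(n, t):
    """n |> t, i.e. n nested one-unit delays; 0 |> t is t itself."""
    for _ in range(n):
        t = Delay(t)
    return t


def power(x, n):
    """X^n, i.e. n parallel copies of x; X^0 is Nil."""
    if n == 0:
        return Nil()
    t = x
    for _ in range(n - 1):
        t = Par(x, t)
    return t


def variables(t):
    """Var(t): the set of process variable names occurring in t."""
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, Par):
        return variables(t.left) | variables(t.right)
    if isinstance(t, Delay):
        return variables(t.body)
    return set()


X, Y = Var("X"), Var("Y")
example = Par(X, Delay(Par(X, Y)))
```

Here `variables(example)` yields the set containing `"X"` and `"Y"`, as in the example of the text.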

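Definition 2.2. A TBPP declaration is a finite set $\Delta \subseteq \mathcal{X} \times Act \times \mathcal{T}$ of
process rewrite rules, written $\{X_i \xrightarrow{a_i} t_i \mid i = 1, \ldots, n\}$, such that
$Var(t_i) \subseteq \{X_1, \ldots, X_n\}$ for any $i$.

Note that the $X_i$'s need not be distinct. Additionally, we require that any
variable $X_i$ used in $\Delta$ appears in the left-hand side of at least one rule from $\Delta$ (this
is for technical convenience only).

In the examples, we often use the convenient CCS-like notations with
action-prefixing, non-deterministic choice (denoted by $+$) and guarded recursion. E.g.,
the definition

\[
\begin{aligned}
X_1 &\stackrel{def}{=} a.(1 \triangleright (a \parallel a \parallel a)) + a.(1 \triangleright (a.a.a)), \\
X_2 &\stackrel{def}{=} a.(1 \triangleright (a \parallel a \parallel a)) + a.(1 \triangleright (a.a \parallel 1 \triangleright a)) + a.(1 \triangleright (a.a.a))
\end{aligned}
\]

is just a shorthand for

\[
\Delta \stackrel{def}{=} \left\{
\begin{array}{l}
X_1 \xrightarrow{a} 1 \triangleright (Z_a \parallel Z_a \parallel Z_a), \quad X_1 \xrightarrow{a} 1 \triangleright Z_{a.a.a}, \\
X_2 \xrightarrow{a} 1 \triangleright (Z_a \parallel Z_a \parallel Z_a), \quad X_2 \xrightarrow{a} 1 \triangleright (Z_{a.a} \parallel 1 \triangleright Z_a), \quad X_2 \xrightarrow{a} 1 \triangleright Z_{a.a.a}, \\
Z_a \xrightarrow{a} \mathit{Nil}, \quad Z_{a.a} \xrightarrow{a} Z_a, \quad Z_{a.a.a} \xrightarrow{a} Z_{a.a}
\end{array}
\right\}
\]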




2.2 Operational Semantics 

The evolution of a TBPP process is represented by a transition system where
the steps carry visible labels of the form $(a, n)$, where $a \in Act$ is an action and
$n \in \mathbb{N}$ is the time at which the step occurs. Actually, $n$ is the time at which the
step starts, and knowing when it finishes requires knowing the duration of $a$ (the
time it takes to perform an $a$).

Definition 2.3. A duration function $f$ is a mapping from $Act$ to $\mathbb{N} \setminus \{0\}$.

$\mathbb{1}$ is the constant duration function s.t. $\mathbb{1}(a) = 1$ for any $a$.

Having $f(a) = 3$ means that $a$ takes 3 time units. Here a duration function
may represent for instance the performance of a particular machine. Thus this
framework makes it possible to clearly distinguish the functional definition $\Delta$
and the performance definition $f$.
Remark 2.4. It is possible to generalise duration "functions" so that (possibly
infinite) sets of values are associated with actions. Our main decidability result
is still valid in this framework (assuming that $f$ is given effectively, for example
by having $f(a)$ be a recognizable set of natural numbers) but the complexity
measures are affected. □

A pair $(\Delta, f)$ where $\Delta$ is a TBPP declaration and $f$ a duration function
defines a labeled transition relation $\to_f \subseteq \mathcal{T} \times (Act \times \mathbb{N}) \times \mathcal{T}$, where $\to_f$ is given
inductively via the following SOS rules:

\[
\frac{(X \xrightarrow{a} t) \in \Delta}{X \xrightarrow{a,0}_f f(a) \triangleright t}
\qquad
\frac{t \xrightarrow{a,n}_f t'}{t \parallel u \xrightarrow{a,n}_f t' \parallel u}
\qquad
\frac{t \xrightarrow{a,n}_f t'}{1 \triangleright t \xrightarrow{a,n+1}_f 1 \triangleright t'}
\]

We use the usual standard abbreviations: $t \xrightarrow{w}_f t'$ (with $w \in (Act \times \mathbb{N})^*$),
$t \xrightarrow{*}_f t'$, ... and omit the $f$ subscript when it is clear from the context.



A run of $t$ is a finite or infinite sequence $(t =)\ t_0 \xrightarrow{a_1,n_1} t_1 \xrightarrow{a_2,n_2} t_2 \cdots
t_k \cdots$. The trace of such a run is the sequence $w = (a_1, n_1)(a_2, n_2) \cdots (a_k, n_k) \cdots$.

A run is ill-timed if there are two positions $i > j$ s.t. $n_i < n_j$. TBPP allows
ill-timed runs, but [AM96] argues convincingly that (1) this brings no semantical
problem since "the ill-timed runs are well-caused" (i.e. local, causally related,
clock values do increase along a run), and (2) this greatly simplifies the technical
treatment (see also [CGR97]).

Example 2.5. Consider $f = \mathbb{1}$ and the term $X$ given by $X \stackrel{def}{=} a(bb \parallel c) + ac(b \parallel b)$.
The maximal traces of $X$ are $(a, 0)(b, 1)(b, 2)(c, 1)$, $(a, 0)(b, 1)(c, 1)(b, 2)$,
$(a, 0)(c, 1)(b, 1)(b, 2)$ and $(a, 0)(c, 1)(b, 2)(b, 2)$. The first one is ill-timed.
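The traces of Example 2.5 can be recomputed with a small interpreter for the SOS rules. This is a sketch under assumptions: the term classes, the encoding of $\Delta$ as a list of triples, and the auxiliary variables `Zbb`, `Zb`, `Zc`, `Zcbb` (which spell out the CCS-like shorthand) are illustrative names, and the parallel rule is applied to both components, which we assume is intended since the traces of the example interleave both sides. The enumeration only terminates for processes with finite behaviour.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nil:
    """The empty process."""


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Par:
    left: object
    right: object


@dataclass(frozen=True)
class Delay:
    body: object


def delay(n, t):
    """n |> t as n nested one-unit delays."""
    for _ in range(n):
        t = Delay(t)
    return t


def steps(t, decl, dur):
    """Yield ((action, time), successor) for every move of t."""
    if isinstance(t, Var):
        for (x, a, body) in decl:
            if x == t.name:
                yield (a, 0), delay(dur[a], body)
    elif isinstance(t, Par):
        for lab, l2 in steps(t.left, decl, dur):
            yield lab, Par(l2, t.right)
        # Assumed symmetric rule for the right component.
        for lab, r2 in steps(t.right, decl, dur):
            yield lab, Par(t.left, r2)
    elif isinstance(t, Delay):
        for (a, n), b2 in steps(t.body, decl, dur):
            yield (a, n + 1), Delay(b2)


def maximal_traces(t, decl, dur):
    """Set of traces of maximal runs (finite behaviours only)."""
    succ = list(steps(t, decl, dur))
    if not succ:
        return {()}
    return {(lab,) + rest
            for lab, t2 in succ
            for rest in maximal_traces(t2, decl, dur)}


def ill_timed(trace):
    """True if some later step carries a strictly smaller date."""
    times = [n for (_, n) in trace]
    return any(times[i] < times[j]
               for i in range(len(times)) for j in range(i))


# X = a(bb || c) + ac(b || b), with every action of duration 1.
DECL = [
    ("X", "a", Par(Var("Zbb"), Var("Zc"))),
    ("X", "a", Var("Zcbb")),
    ("Zbb", "b", Var("Zb")),
    ("Zb", "b", Nil()),
    ("Zc", "c", Nil()),
    ("Zcbb", "c", Par(Var("Zb"), Var("Zb"))),
]
DUR = {"a": 1, "b": 1, "c": 1}
TRACES = maximal_traces(Var("X"), DECL, DUR)
```

Under these assumptions `TRACES` contains exactly the four traces listed in the example, and `ill_timed` singles out the first of them.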



2.3 Timing Measures 

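Two structural measures can be associated with a term: $minclock(u) \in \mathbb{N} \cup \{\infty\}$
is the earliest time at which $u$ can start an action, while $maxclock(u) \in \mathbb{N} \cup \{-\infty\}$
is the latest time.

We assume ordering and addition over $\mathbb{N}$ are extended in the obvious way to
$\infty$ and $-\infty$, and we define the two measures by structural induction over terms:

\[
\begin{aligned}
minclock(\mathit{Nil}) &\stackrel{def}{=} \infty & maxclock(\mathit{Nil}) &\stackrel{def}{=} -\infty \\
minclock(X) &\stackrel{def}{=} 0 & maxclock(X) &\stackrel{def}{=} 0 \\
minclock(1 \triangleright u) &\stackrel{def}{=} 1 + minclock(u) & maxclock(1 \triangleright u) &\stackrel{def}{=} 1 + maxclock(u) \\
minclock(u \parallel v) &\stackrel{def}{=} \min(minclock(u), minclock(v)) & & \\
maxclock(u \parallel v) &\stackrel{def}{=} \max(maxclock(u), maxclock(v)) & &
\end{aligned}
\]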
Example 2.6. For $u = 1 \triangleright (X \parallel 2 \triangleright X)$ we have $minclock(u) = 1$ and $maxclock(u) = 3$,
and indeed if $\Delta$ contains $X \xrightarrow{a} t$, then $u \xrightarrow{a,1} \ldots$ and $u \xrightarrow{a,3} \ldots$

More generally:

Lemma 2.7. For any $u$, $u \xrightarrow{a,n} v$ implies $minclock(u) \le n \le maxclock(u)$ and
$minclock(u) \le minclock(v)$.

In the other direction, if $u$ can make a move, then there exists a move $u \xrightarrow{a,n} v$
with $n = minclock(u)$ and a $u \xrightarrow{a',n'} v'$ with $n' = maxclock(u)$.

Proof. Easy induction on $u$. □
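The two measures of Section 2.3 translate directly into structural recursion. In the sketch below the term classes and the use of `math.inf` for $\infty$ are choices made for illustration; the defining clauses follow the equations given above.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Nil:
    """The empty process."""


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Par:
    left: object
    right: object


@dataclass(frozen=True)
class Delay:
    body: object


def minclock(t):
    """Earliest date at which t can start an action (inf for Nil)."""
    if isinstance(t, Nil):
        return math.inf
    if isinstance(t, Var):
        return 0
    if isinstance(t, Delay):
        return 1 + minclock(t.body)
    return min(minclock(t.left), minclock(t.right))


def maxclock(t):
    """Latest date at which t can start an action (-inf for Nil)."""
    if isinstance(t, Nil):
        return -math.inf
    if isinstance(t, Var):
        return 0
    if isinstance(t, Delay):
        return 1 + maxclock(t.body)
    return max(maxclock(t.left), maxclock(t.right))


# Example 2.6: u = 1 |> (X || 2 |> X)
X = Var("X")
u = Delay(Par(X, Delay(Delay(X))))
```

For this `u` the functions return 1 and 3 respectively, in agreement with Example 2.6.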

More fundamental is the following lemma, stating that $minclock$ can be made
arbitrarily large:

Lemma 2.8. For any $u$ and any $n \in \mathbb{N}$ there is a $u \xrightarrow{*} v$ s.t. $minclock(v) > n$.

Proof. An easy induction on $u$ shows that if $minclock(u) < \infty$ then $u \xrightarrow{*} v$
for some $v$ s.t. $minclock(v) > minclock(u)$. □

3 Performance Equivalences 

In this section, we recall the definition of performance equivalence introduced
in [?]: "$f$-performance equivalence" is associated with a duration
function $f$ while "independent-performance equivalence" abstracts from the
particular duration function.

$f$-performance equivalence corresponds to strong bisimulation [Mil89] on
TBPP transitions, taking timing information into account.

Definition 3.1. A relation $\mathcal{R} \subseteq \mathcal{T} \times \mathcal{T}$ is called an $f$-performance relation if
$u \mathcal{R} v$ implies that

1. for any $u \xrightarrow{a,n}_f u'$ there is a move $v \xrightarrow{a,n}_f v'$ s.t. $u' \mathcal{R} v'$,

2. and vice versa: for any $v \xrightarrow{a,n}_f v'$ there is a $u \xrightarrow{a,n}_f u'$ with $u' \mathcal{R} v'$.

Definition 3.2. Two TBPP terms $u$ and $v$ are $f$-performance equivalent
(written $u \sim_f v$) if there is an $f$-performance relation $\mathcal{R}$ such that $u \mathcal{R} v$.

Example 3.3. Assume $f(a) = 1$ and consider $X \stackrel{def}{=} a.(X \parallel X)$ and $Y \stackrel{def}{=} a.Y$.
Then $X \not\sim_f Y$ because the steps $X \xrightarrow{a,0} 1 \triangleright (X \parallel X) \xrightarrow{a,1} 1 \triangleright (X \parallel 1 \triangleright (X \parallel X))
\xrightarrow{a,1} 1 \triangleright (1 \triangleright (X \parallel X) \parallel 1 \triangleright (X \parallel X))$ cannot be imitated by $Y$. (However $X$
and $Y$ are bisimilar when timing is not taken into account: they both behave as



Verifying Performance Equivalence for Timed Basic Parallel Processes 






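As expected, $\sim_f$ is the largest $f$-performance relation, it is an equivalence, and
a congruence for the $\parallel$ and $1 \triangleright$ operators:

Proposition 3.4. If $u \sim_f v$ and $u' \sim_f v'$ then $1 \triangleright u \sim_f 1 \triangleright v$ and $u \parallel u' \sim_f v \parallel v'$.

Proof. A consequence of the fact that the SOS rules for $\to_f$ are in tyft/tyxt, or
even De Simone's, format [GV92]. □

Additionally, $u \sim_f v$ entails $minclock(u) = minclock(v)$ and $maxclock(u) =
maxclock(v)$, as a consequence of Lemma 2.7.

$f$-performance equivalence enjoys the usual associativity, commutativity and
nilpotence laws. The distributivity law, (Eq4), is called a clock distribution
equation in [CGR97]:

Proposition 3.5. For any terms $t, u, v$

\[
\begin{aligned}
u \parallel t &\sim_f t \parallel u & \text{(Eq1)} \\
(u \parallel t) \parallel v &\sim_f u \parallel (t \parallel v) & \text{(Eq2)} \\
t \parallel \mathit{Nil} &\sim_f t & \text{(Eq3)} \\
1 \triangleright (u \parallel v) &\sim_f (1 \triangleright u) \parallel (1 \triangleright v) & \text{(Eq4)} \\
1 \triangleright \mathit{Nil} &\sim_f \mathit{Nil} & \text{(Eq5)}
\end{aligned}
\]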



3.1 Performance Not Depending on $f$

Our definitions followed [CGR97] in that we did not mix functional definitions
(the rules in $\Delta$, the program, . . . ) and timing definitions (the duration function
$f$, the hardware, ... ).

We may now define a notion of performance equivalence that does not depend
on $f$:

Definition 3.6. Two terms $u$ and $v$ are independent-performance equivalent
(written $u \sim_i v$) if $u \sim_f v$ for any duration function $f$.

$\sim_i$ is a congruence since it is an intersection of congruences.

Remark 3.7. A byproduct of our study is a proof that $\sim_f$ and $\sim_i$ coincide for any
$f$ (Corollary 5.8), which we see as the reason why [CGR97] introduced both an
$f$-performance and an independent-performance preorder (these two preorders
do not coincide) but only one performance equivalence, and did not comment
about this. However, since we cannot prove Corollary 5.8 without the technical
developments of the next sections, we shall keep writing $\sim_f$ as long as necessary.
□



4 Structural Congruence 

Here we introduce a structural congruence for TBPP. It allows us to exhibit a
normal form for the terms that generalizes the usual normal form for
BPP [CHM93].

Definition 4.1. We denote by $\equiv$ the smallest congruence induced by the five
equations of Proposition 3.5.

Clearly $u \equiv v$ implies $u \sim_f v$ since $\sim_f$ is a congruence and it satisfies the five
equations. Also, since $\equiv$ does not depend on $f$, $u \equiv v$ entails $u \sim_i v$.

Definition 4.2. A term $u \in \mathcal{T}$ is in normal form if it is some $n_1 \triangleright X_1 \parallel \ldots \parallel
n_k \triangleright X_k$ (where the $X_i$'s need not be distinct, and where we allow $n_i = 0$ or
$k = 0$).

Using Proposition 3.5, any term can be rewritten to a structurally equivalent
normal form. Moreover, this normal form is unique (modulo associativity and
commutativity of $\parallel$). Sometimes we are only interested in the subterms "$0 \triangleright X_i$"
in a normal form and write it $X_1 \parallel \ldots \parallel X_n \parallel 1 \triangleright u$.

The normal form of a term $u$ displays all dates for which $u$ can make an
immediate step. A consequence is the very useful Lemma:

Lemma 4.3. $u \sim_f \mathit{Nil}$ iff $u \equiv \mathit{Nil}$ iff $minclock(u) = +\infty$ iff $maxclock(u) =
-\infty$.

5 Cancellation for Performance Equivalence 

In this section, we prove the surprising result that performance equivalence can
be reduced to a notion of equality of normal forms. For this, we use a
decomposition approach along the lines that have been pioneered by [MM93] and
which often work nicely in timed or normed settings (see Prop. 30 in [AM96] or
Prop. 2.2.8 in [Hen88]).

The following lemma is the converse of Proposition 3.4. It emphasizes the
link between the behaviours of the terms $u$ and $1 \triangleright u$.

Lemma 5.1. $1 \triangleright u \sim_f 1 \triangleright v$ entails $u \sim_f v$.

Proof. Standard: one checks that
$\mathcal{R} = \{(u_1, u_2) \mid 1 \triangleright u_1 \sim_f 1 \triangleright u_2\}$ is an
$f$-performance equivalence. □

Given two TBPP terms $u$ and $v$, we say that $u$ is earlier than $v$ if
$maxclock(u) < minclock(v)$ and $u \not\sim_f \mathrm{Nil}$. A separated product is some
$u \parallel v$ with $u$ earlier than $v$. This syntactic notion is useful because when
$u \parallel v$ makes a move at time $n$, it is possible to assign the move to $u$ or $v$
on the basis of $n$ only.

Lemma 5.2. Assume $u_1 \parallel u_2$ and $v_1 \parallel v_2$ are separated products s.t.
$u_1$ and $v_1$ have same maxclock. Then $u_1 \parallel u_2 \sim_f v_1 \parallel v_2$ entails
(1) $u_2 \sim_f v_2$ and (2) $u_1 \sim_f v_1$.

Proof. (1) is easy to see with the separation hypothesis. Let $\mathcal{R}$ be the set
of all pairs $(u, v)$ s.t. $u_1 \parallel u \sim_f v_1 \parallel v$ and both $u_1 \parallel u$ and
$v_1 \parallel v$ are separated. We show that $\mathcal{R} \cup \sim_f$ is an $f$-performance
equivalence. Indeed, if $u \xrightarrow{a,n} u'$ then
$u_1 \parallel u \xrightarrow{a,n} u_1 \parallel u'$ which is still separated (or
$u' \sim_f \mathrm{Nil}$). Now there is a $v_1 \parallel v \xrightarrow{a,n} t$ with
$u_1 \parallel u' \sim_f t$ but this step can only come from $v$, so that $t$ is some
separated $v_1 \parallel v'$ (or $v' \sim_f \mathrm{Nil}$). Since $u_1$ and $v_1$ have same
maxclock, $u' \sim_f \mathrm{Nil}$ iff $v' \sim_f \mathrm{Nil}$ and we have
$(u', v') \in \mathcal{R} \cup \sim_f$.

(2) We now prove $u_1 \sim_f v_1$. Let $\mathcal{R}$ be the set of all pairs $(u, v)$ s.t.
$u$ and $v$ have same maxclock, and there exists a $t$ s.t.
$u \parallel t \sim_f v \parallel t$ and $u \parallel t$ and $v \parallel t$ are separated. We show
$\mathcal{R} \cup \sim_f$ is an $f$-performance equivalence. Consider a pair
$(u, v) \in \mathcal{R}$ (via some $t$) and let $K$ be the largest maxclock for all
immediate successors of $u$ and $v$. $K$ is finite because TBPP has finite branching.
Thanks to Lemma 2.8 there is a sequence $w$ s.t. $t \xrightarrow{w} t'$ and
$minclock(t') > K$.

Consider a step $u \xrightarrow{a,n} u'$. Now
$u \parallel t \xrightarrow{w} u \parallel t' \xrightarrow{a,n} u' \parallel t'$. Then there must
exist a $v \parallel t \xrightarrow{w} v \parallel t'' \xrightarrow{a,n} v' \parallel t''$ with
$u \parallel t' \sim_f v \parallel t''$ and $u' \parallel t' \sim_f v' \parallel t''$.

We have $t' \sim_f \mathrm{Nil}$ iff $t'' \sim_f \mathrm{Nil}$ (because they have same
maxclock) so that (1) gives us $t' \sim_f t''$. Thanks to $minclock(t') > K$, we have
$u' \sim_f \mathrm{Nil}$ iff $v' \sim_f \mathrm{Nil}$ (because $u' \parallel t'$ and
$v' \parallel t''$ have same minclock). If $u' \not\sim_f \mathrm{Nil}$ then both
$u' \parallel t'$ and $v' \parallel t''$ are separated, so that $(u', v') \in \mathcal{R}$.
Otherwise $u' \sim_f \mathrm{Nil} \sim_f v'$. □

Of course, normal forms are separated in an obvious way. Hence: 

Lemma 5.3. Assume $X_1 \parallel \ldots \parallel X_m \sim_f X'_1 \parallel \ldots \parallel X'_{m'}$.
Then $m = m'$ and to any $X_i$ we can associate a $X'_j$ s.t. $X_i \sim_f X'_j$.

Proof. Obviously m = m' since any maximal execution of Xi || ... || Xm has 
exactly m steps with date 0. Now pick actions afs s.t. Xi Ui. We have 
Xi II ... II Xm Xi II 1 > M. Then there is || ... || 

M with Xi II 1 [> M ~/ M. But V is reached by m — 1 steps 
at date 0 from X[ || ... || Xf,, hence it has the form Xj || 1 > m'. The previous 
lemmas entail X\ ~/ Xj (and m ^/ m'), which conclude the proof. □ 

Lemma 5.4. Assume $X_1 \parallel \ldots \parallel X_m \sim_f X'_1 \parallel \ldots \parallel X'_m$.
Then there is a bijective $h : [1..m] \to [1..m]$ s.t. $X_i \sim_f X'_{h(i)}$ for all $i$.

Proof. We split the multiset {Xi, . . . ,Xm,X[,... ,Xf^} into the equivalence 
classes induced by ^/. If every class contains exactly as many Xfs as X^s, 
then h is easy to build. Otherwise we can assume w.l.o.g. that one class is 
{Xi,X 2 , . . . , Xp, X[,X 2 , . . . , Xg} with p < q. Assume Xi °‘’°> for all z’s, and 
consider w = (oi,0) . . . (ap,0). We have a move Ai || . . . || X^ li>w || Xp+i || 

. . . II Xm. This is imitated by A( || . . . || ^ 1 o m' || X[^^^ || . . . || X[^. 

Lemma 15.21 entails that Xp+i || ... || Xm ~/ || ... || X[^. Now one index 

(say j) in {ip+i, ... , *m} must belong to {1, . . . ,q}. This contradicts Lemma HTTH 
because we assumed A' has no match in Xp^i , . . . , Xm.. □ 
As a consequence, we now have the following important result, reducing $\sim_f$ to
“equality” of normal forms:

Theorem 5.5. Assume $u = n_1 \triangleright X_1 \parallel \ldots \parallel n_m \triangleright X_m$
and $v = n'_1 \triangleright X'_1 \parallel \ldots \parallel n'_{m'} \triangleright X'_{m'}$. Then
$u \sim_f v$ iff there is a bijective $h : [1..m] \to [1..m']$ s.t. $n_i = n'_{h(i)}$
and $X_i \sim_f X'_{h(i)}$ for all $i$.

Hence $f$-performance equivalence of $u$ and $v$ can be reduced to a combination
of $f$-performance equivalence of variables.

An equivalence relation $\approx$ between variables of $\mathcal{X}$ can be extended to
terms: we say $u \approx v$ when the normal forms $n_1 \triangleright X_1 \parallel \ldots$ and
$n'_1 \triangleright X'_1 \parallel \ldots$ of $u$ and $v$ can be related by a bijective $h$
s.t. $n_i = n'_{h(i)}$ and $X_i \approx X'_{h(i)}$.

Definition 5.6. An equivalence relation $\approx$ between variables of $\mathcal{X}$ has
the transfer property if for any $X \approx Y$ and for any $X \xrightarrow{a} u$ there
is a $Y \xrightarrow{a} v$ s.t. $u \approx v$.

Clearly, if $\approx$ has the transfer property, then its extension to terms is an
$f$-performance equivalence. Conversely, Theorem 5.5 implies that
$\sim_f \cap\, (\mathcal{X} \times \mathcal{X})$ has the transfer property. But the transfer
property for some $\approx$ does not depend on $f$. Hence

Lemma 5.7. Let $f$ and $g$ be two duration functions. Then $\sim_f$ and $\sim_g$ coincide.

Corollary 5.8. $u \sim_f v$ iff there is a duration function $f$ such that $u \sim_f v$
iff $u \sim_i v$.



Remark 5.9. Corollary 5.8 calls for comments. It is not a paradox and can be
compared, e.g., with Prop. 13 from [AM96]. Still, we see no easy way to prove
it without going through the analysis required for our Theorem 5.5.

Observe that it does not hold if we allow duration functions taking the value
zero (which is rather meaningless in our framework). E.g., the terms from
Example 3.3 become performance equivalent when $f(a) = 0$.

Similarly, it does not hold in a framework where we associate several values
to a same action (cf. Remark 2.4). E.g., with

A = {X X,X -^2>X, y-^A,y-^l>A,y-^2[>A} 

we have A F but X Y when /(a) = {1,2}. □ 

As a consequence, we may write indistinctly $\sim$ for any $\sim_f$ (and for $\sim_i$).
We do that in the rest of the paper, where we assume additionally that $f$ is the
constant duration function 1.
6 Decidability of Performance Equivalence

With the results from Section 5, deciding performance equivalence is simple
since it amounts to computing the largest equivalence on variables that has the
transfer property.

Proposition 6.1. Computing $\sim \cap\, (\mathcal{X} \times \mathcal{X})$ can be done in time
polynomial in $|\Delta|$.

(Where $|\Delta|$ is the number of rules plus the sum of the sizes of the left-hand
sides.)

Proof. Given $\Delta$ we partition the set $\mathcal{X}$ of variables into equivalence
classes. This is done in the usual way, starting with
$\approx_0 = \mathcal{X} \times \mathcal{X}$ and refining $\approx_i$ into $\approx_{i+1}$
until stabilization. The refinement step removes a pair $(X, Y)$ from $\approx_i$
whenever there is a $X \xrightarrow{a} u$ in $\Delta$ s.t. no $Y \xrightarrow{a} v$ has
$u \approx_i v$ (which can be checked easily by a sorting algorithm when $u$ and $v$
are in normal form). Stabilization is reached after at most $|\mathcal{X}| - 1$
refinement steps. □
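
The refinement loop of this proof can be sketched in Python. This is only an illustration under assumptions that are not in the paper: a rule $X \xrightarrow{a} u$ is encoded as a triple `(X, a, nf)` where `nf` is the normal form of $u$ given as a list of `(n, Y)` pairs, and the names `equivalent_terms` and `largest_transfer_equivalence` are hypothetical. Classes are split by signatures, a standard partition-refinement variant of the pair-removal step described above.

```python
def equivalent_terms(nf_u, nf_v, cls):
    """Compare two normal forms up to the current classes, by sorting.

    nf_u, nf_v: lists of (n, Y) pairs standing for n |> Y (assumed encoding).
    cls: dict mapping each variable to the id of its current class.
    """
    key = lambda nf: sorted((n, cls[Y]) for (n, Y) in nf)
    return key(nf_u) == key(nf_v)


def largest_transfer_equivalence(variables, rules):
    """Return a dict variable -> class id for the coarsest stable partition.

    rules: iterable of (X, a, nf) triples (assumed encoding of Delta).
    """
    rules = list(rules)
    cls = {X: 0 for X in variables}          # start from the full relation
    while True:
        def signature(X):
            # What X can do, seen through the current classes.
            return frozenset(
                (a, tuple(sorted((n, cls[Y]) for (n, Y) in nf)))
                for (Z, a, nf) in rules if Z == X
            )
        sig = {X: (cls[X], signature(X)) for X in variables}
        ids = {}
        new = {X: ids.setdefault(sig[X], len(ids)) for X in variables}
        if len(ids) == len(set(cls.values())):   # no class was split
            return new
        cls = new


# Hypothetical rule set: X and Y both loop on themselves, Z delays itself.
rules = [("X", "a", [(0, "X")]), ("Y", "a", [(0, "Y")]), ("Z", "a", [(1, "Z")])]
classes = largest_transfer_equivalence(["X", "Y", "Z"], rules)
```

Two variables end up in the same class exactly when each rule of one is matched by a rule of the other with the same action and an equivalent normal form, which is the transfer property in both directions.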

Hence deciding whether $u \sim v$ can be done in time polynomial in
$|u| + |v| + |\Delta|$. Finally we have

Theorem 6.2. Deciding performance equivalence over TBPP is P-complete.

Proof. We already know membership in P and only prove P-hardness.

When no parallel composition is involved, TBPP terms behave like finite-state
processes where the single local clock just records the length of the history
of the computation. Hence performance equivalence of these sequential
terms reduces to strong untimed bisimilarity of the underlying unfolded trees,
which is just strong bisimilarity of untimed finite state processes, entailing
P-hardness [BGS92]. □

7 Conclusion 

In this paper we investigated TBPP, a timed extension of the BPP. TBPP is
essentially equivalent to the algebra of [CGR97], itself obtained by forbidding
synchronization in earlier process algebra with urgent durational actions.

In this framework, [CGR97] introduced performance equivalence as a way to
relate processes having the same behaviour and the same efficiency.

Our main result is a polynomial-time method for deciding performance
equivalence over this class where systems can have an infinite number of
different states (even disregarding time). Thus, BPP + Time turns out to be
simpler than plain BPP, which is a surprising result. This suggests that timed
extensions of related infinite-state algebras should be investigated and could
well turn out to be simpler than their better-known untimed counterparts. Let us
suggest some directions:

1. Bisimulation of normed PA processes is decidable [HJ99] but appears quite
complex. What about performance equivalence for PA + Time?
2. Decidability of observational equivalence (a.k.a. $\tau$-bisimulation) of BPP
processes is an important open problem [Esp97, KM99]. What about observational
performance equivalence? (Adding $\tau$'s to TBPP can be done in several ways:
e.g., they can model internal actions with null duration instead of
abstracted-away actions with positive duration.)

3. Most behavioural equivalences are undecidable on BPP processes [Hüt94].
What about BPP + Time?
Abstract. We define four families of word-rewriting systems: the
prefix/suffix systems and the left/right systems. The rewriting of prefix
systems generalizes the prefix rewriting of systems: a system is prefix
(suffix) if a left hand side and a right hand side are overlapping only by
prefix (suffix). The rewriting of right systems generalizes the mechanism
of transducers: a system is right (left) if a left hand side overlaps a right
hand side only on the right (left).

We show that these systems have a rational derivation even if they are
not only finite but recognizable. Besides these four families, we give
simple systems having a non rational derivation.



1 Introduction 

A general approach to verify properties for systems is to decide whether
formulas are verified by their transition graphs: systems with isomorphic
transition graphs have the same properties. These graphs are in general infinite
but we have a hierarchy of graph families: finite graphs, regular graphs,
prefix-recognizable graphs, rational graphs. A family of infinite graphs has been
defined in [MS 85]: the connected regular graphs of finite degree meaning that
they have a finite number of non isomorphic connected components by
decomposition by distance from any vertex. The regular graphs of finite degree
are the transition graphs of pushdown automata (restricted to a rational
configuration set) and are also the prefix transition graphs of finite
word-rewriting systems [Ca 90]: finite unions of elementary graphs of the form
$(u \xrightarrow{a} v).W = \{ uw \xrightarrow{a} vw \mid w \in W \}$ where $u, v$ are
words and $W$ is a rational language. This family has been extended in [Co 90]
to all the regular graphs (or equational graphs): the graphs generated by the
deterministic graph grammars. A larger family is composed of the
prefix-recognizable graphs [Ca 96] which are the prefix transition graphs of the
recognizable word-rewriting systems: finite unions of elementary graphs of the
form $(U \xrightarrow{a} V).W$ where $U, V, W$ are rational languages. Finally, an
even larger family of graphs is the set of rational graphs studied in [Mo 00]:
the graphs recognized by transducers with labelled outputs. Clearly, all these
representations are heterogeneous and a central question is to find a simple and
uniform specification for all these graphs.
A solution has been proposed in [CK 98] which considers the ‘Cayley graph’ of
any word-rewriting system: the set of transitions $u \xrightarrow{a} v$ if $u, v$ are
irreducible words, $a$ is a letter and $ua$ derives into $v$. To represent as Cayley
graphs the regular graphs and the prefix-recognizable graphs, we translate the
prefix (resp. suffix) rewriting of systems [Bü 64] into the rewriting of
particular systems, called prefix (resp. suffix) systems. A system is called
prefix (resp. suffix) if a left hand side and a right hand side are overlapping
only by prefix (resp. suffix). To represent as Cayley graphs the rational graphs,
we translate the mechanism of transducers into the rewriting of particular
systems, called right systems. A system is called right (resp. left) if a left
hand side overlaps a right hand side only on the right (resp. left). These
systems yield a uniform characterization of all the previous families of graphs.

In this paper, we show that these systems have a rational derivation: the
derivation relation itself (the reflexive and transitive closure by composition
of the rewriting) is recognizable by a transducer (a finite automaton where each
label is a couple of words), and we can construct such a transducer in polynomial
time. Such a result is general: a rational relation preserves rational and
context-free languages, and the composition of rational relations remains
rational. Many other properties are well known [Be 79], [AB 88]. Furthermore the
derivation is rational when the systems (left, right, prefix, suffix) are not
only finite but recognizable (and false for rational systems). Finally, it
appears that we can have a non rational derivation for the remaining families of
rewriting systems, defined by overlapping between the left hand sides and the
right hand sides.



2 Rational and Recognizable Relations 

We present notations and basic properties for rational relations and recognizable 
relations. 

For any set $E$, we denote by $\#E$ its cardinal and by $2^E$ its powerset. Let
$\mathbb{N}$ be the set of nonnegative integers and for any $n \in \mathbb{N}$, let
$[n] = \{1, \ldots, n\}$ with $[0] = \emptyset$.

A binary (total) operation $\cdot$ on a set $E$ is a mapping from $E \times E$ into $E$
and we write $a \cdot b$ instead of $\cdot(a, b)$. A set $M$ with a binary operation
$\cdot$ on $M$ is a monoid if $\cdot$ is associative:
$(a \cdot b) \cdot c = a \cdot (b \cdot c)$ for every $a, b, c \in M$, and has a (unique)
neutral element 1: $a \cdot 1 = 1 \cdot a = a$ for every $a \in M$. The powerset $2^M$
of $M$ is a monoid for operation $\cdot$ extended by union to subsets:
$P \cdot Q = \{ a \cdot b \mid a \in P \wedge b \in Q \}$ for every $P, Q \subseteq M$;
$\{1\}$ is the neutral element. A subset $P$ of a monoid $M$ is a submonoid of $M$
if $P$ is a monoid for $\cdot$ of $M$: $P \cdot P \subseteq P$ and $1 \in P$. The
smallest (for inclusion) submonoid of $M$ containing $P$, called the submonoid
generated by $P$, is the following subset:
$P^* = \bigcup_{n \geq 0} P^n$ with $P^0 = \{1\}$ and $P^{n+1} = P^n \cdot P$ for
every $n$. The subset $P^*$ is also called the reflexive and transitive closure by
$\cdot$ of $P$. Note that $(P^*)^* = P^*$ and $\emptyset^* = \{1\}$.
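
As a small illustration of these definitions, the following Python sketch instantiates the monoid as strings under concatenation (neutral element `""`). The function names `product` and `star_up_to` are hypothetical, and since $P^*$ is infinite in general the closure is truncated at a length bound, which is an assumption of the sketch and not part of the definition.

```python
def product(P, Q):
    """P . Q = { a . b | a in P and b in Q } in the monoid of strings."""
    return {a + b for a in P for b in Q}


def star_up_to(P, max_len):
    """Elements of P* of length at most max_len.

    Builds P^0 = {1}, then P^(n+1) = P^n . P layer by layer, discarding
    words that exceed the bound (their extensions would be too long too).
    """
    result = {""}            # P^0 = {1}, the neutral element
    layer = {""}
    while layer:
        layer = {w for w in product(layer, P) if len(w) <= max_len} - result
        result |= layer
    return result


closure = star_up_to({"ab", "c"}, 3)
```

With an empty generating set the loop stops immediately and returns `{""}`, matching the remark that the closure of the empty set is the singleton of the neutral element.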
We say that $M$ is finitely generated if $M = P^*$ for some finite $P$. We say that
$M$ is free if $M = P^*$ for some code $P$: there are no two factorizations in $P^*$ of
a same element i.e. if $a_1 \ldots a_m = b_1 \ldots b_n$ for
$a_1, \ldots, a_m, b_1, \ldots, b_n \in P$ then $m = n$ and $a_i = b_i$ for all $i$.

We say that $M$ is free finitely generated if $M = P^*$ for some finite code $P$.

The set $Rat(M)$ of the rational subsets of $M$ is the smallest subset of $2^M$
containing the finite subsets of $M$ and closed by the three operations
$\cup$, $\cdot$, $*$.

We can also recognize the rational subsets by finite automata. 

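Let $P$ be a subset of $M$. A (simple oriented labelled) $P$-graph $G$ is a subset
of $V \times P \times V$ where $V$ is an arbitrary set. Any $(s, a, t)$ of $G$ is a
labelled arc of source $s$, of target $t$, with label $a$, and is identified with
the labelled transition $s \xrightarrow[G]{a} t$ or directly $s \xrightarrow{a} t$ if
$G$ is understood. We denote by
$V_G := \{ s \mid \exists a\, \exists t,\ s \xrightarrow{a} t \vee t \xrightarrow{a} s \}$
the set of vertices of $G$. A graph is deterministic if $\xrightarrow{a}$ is a
function for every $a \in P$ i.e. distinct arcs with the same source have distinct
labels: $r \xrightarrow{a} s \wedge r \xrightarrow{a} t \Rightarrow s = t$. The set
$2^{V \times P^* \times V}$ of $P^*$-graphs with vertices in $V$ is a monoid for the
composition:

$$G \circ H := \{ r \xrightarrow{a \cdot b} t \mid \exists s,\ r \xrightarrow[G]{a} s \wedge s \xrightarrow[H]{b} t \} \quad \text{for any } G, H \subseteq V \times P^* \times V;$$

its neutral element is $\{ s \xrightarrow{1} s \mid s \in V \}$ (in fact
$2^{V \times P^* \times V}$ is the powerset monoid of the partial semigroup
$V \times P^* \times V$ with the partial operation
$(r, a, s) \circ (s, b, t) = (r, a \cdot b, t)$).

The relation $\xrightarrow[G^*]{u}$ denoted by $\overset{u}{\underset{G}{\Longrightarrow}}$
or simply by $\overset{u}{\Longrightarrow}$ if $G$ is understood, is the existence of
a path in $G$ labelled $u \in P^*$. The labels $L(G, E, F)$ of paths from a set $E$ to
a set $F$ is the following subset of $P^*$:
$L(G, E, F) := \{ u \in M \mid \exists s \in E,\ \exists t \in F,\ s \overset{u}{\Longrightarrow} t \}$;
in particular $1 \in L(G, E, F)$ when $E \cap F \neq \emptyset$.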

A $P$-automaton $A$ is a $P$-graph $G$ whose vertices are called states, with a
subset $I$ of initial states and a subset $F$ of final states; the automaton
recognizes the subset $L(A) = L(G, I, F)$ of $P^*$. An automaton is finite if its
graph is finite. An automaton is deterministic if its graph is deterministic and
there is a unique initial state. This allows us to state a standard result on
rational subsets. Given a subset $P$ of a monoid $M$, $Rat(P^*)$ is equivalently

the smallest subset of $2^M$ containing $\emptyset$ and $\{a\}$ for each $a \in P$, and
closed by $\cup$, $\cdot$, $*$;

the set of subsets recognized by the finite $P$-automata;

the set of subsets recognized by the finite deterministic $P$-automata.

Given monoids $M$ and $N$, the cartesian product
$M \times N = \{ (m, n) \mid m \in M \wedge n \in N \}$ is a monoid for the operation
defined by $(m, n) \cdot (m', n') = (m \cdot m', n \cdot n')$ for every $m, m' \in M$ and
every $n, n' \in N$. A relation $R$ from $M$ into $N$ is a subset of $M \times N$ and we
write also $u\, R\, v$ for $(u, v) \in R$. In particular $R$ is a rational relation if
$R$ belongs to $Rat(M \times N)$ i.e. $R$ is recognized by a finite (and deterministic)
$M \times N$-automaton: any transition $\xrightarrow{(u,v)}$ is written simply
$\xrightarrow{u/v}$. For any relations $R, S$ from $M$ into $N$, we have
$R \cdot S = \{ (m \cdot m', n \cdot n') \mid m\, R\, n \wedge m'\, S\, n' \}$ and $R^*$ is the
reflexive and transitive closure by $\cdot$ of $R$. As usual, we denote by
$R^{-1} = \{ (v, u) \mid u\, R\, v \}$ the inverse of $R$ and
$R(P) = \{ v \mid \exists u \in P,\ u\, R\, v \}$ is the image by $R$ of $P \subseteq M$. In
particular $Dom(R) = R^{-1}(N)$ is the domain of $R$ and $Im(R) = R(M)$ is the image
of $R$. Note that for $R \in Rat(M \times N)$, we have $R^{-1} \in Rat(N \times M)$,
$Dom(R) \in Rat(M)$ and $Im(R) \in Rat(N)$.

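Note that the family $2^{M \times M}$ of binary relations on $M$ coincides with the
set $2^{M \times \{1\} \times M}$ of (unlabelled) $\emptyset^*$-graphs: $(u, v)$ coincides
with the transition $u \xrightarrow{1} v$. So $2^{M \times M}$ is a monoid for the
relational composition
$R \circ S = \{ (u, w) \mid \exists v,\ u\, R\, v \wedge v\, S\, w \}$ for every
$R, S \subseteq M \times M$ with the neutral element $Id_M = \{ (u, u) \mid u \in M \}$.
Furthermore for every $R \subseteq M \times M$, $R^* = \bigcup_{n \geq 0} R^n$ is the
reflexive and transitive closure of $R$ for $\circ$: $R^0 = Id_M$ and
$R^{n+1} = R^n \circ R$.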
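
For a finite carrier set, the composition and the closure just defined can be computed directly. The Python sketch below is illustrative only: the names `compose` and `closure` are hypothetical, relations are sets of pairs, and the carrier `M` is assumed finite so that the iteration terminates.

```python
def compose(R, S):
    """R o S = { (u, w) | there is v with u R v and v S w }."""
    return {(u, w) for (u, v) in R for (v2, w) in S if v == v2}


def closure(R, M):
    """Reflexive and transitive closure of R for composition.

    Accumulates R^0 = Id_M, R^1, R^2, ... and stops as soon as a power
    adds no new pair; every later power is then already covered as well.
    """
    result = {(u, u) for u in M}     # R^0 = Id_M
    power = set(result)
    while True:
        power = compose(power, R)    # R^(n+1) = R^n o R
        if power <= result:
            return result
        result |= power


R = {(1, 2), (2, 3)}
star = closure(R, {1, 2, 3})
```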

Another family of subsets of a monoid $M$ is defined by inverse morphism. A
mapping $h$ from $M$ into a monoid $N$ is a (monoid) morphism if $h(1) = 1$ and
$h(a \cdot b) = h(a) \cdot h(b)$ for every $a, b \in M$. A subset $P$ of $M$ is recognizable
if there exists a morphism $h$ from $M$ into a finite monoid $N$ such that
$P = h^{-1}(h(P))$; we denote by $Rec(M)$ the family of recognizable subsets of $M$.

Recognizable subsets are also recognizable by automata. We say that a $P$-graph
$G$ is (source) complete if for every $a \in P$, every vertex $s \in V_G$ is source of
an arc labelled $a$: $\exists t,\ s \xrightarrow{a} t$. We say also that $G$ is
path-deterministic if $G^*$ is deterministic: $\overset{u}{\Longrightarrow}$ is a
function for every $u \in P^*$ i.e. if $r \overset{u}{\Longrightarrow} s$ and
$r \overset{u}{\Longrightarrow} t$ then $s = t$. Given a subset $P$ of a monoid $M$,
$Rec(P^*)$ is the set of subsets recognized by the path-deterministic and complete
$P$-automata having a finite set of states. Another way to characterize a
recognizable subset is by residual:

$$P \in Rec(M) \iff \{ u^{-1}P \mid u \in M \} \text{ is finite}$$

where the set $Q^{-1}P = \{ v \mid \exists u \in Q,\ u \cdot v \in P \}$ is the left residual
of $P$ by $Q \subseteq M$. We denote also by
$PQ^{-1} = \{ u \mid \exists v \in Q,\ u \cdot v \in P \}$ the right residual of $P$ by $Q$.
The characterizations of the rational and recognizable subsets by automata allow
us to deduce the usual facts:

$Rec(M)$ is a boolean algebra

$P \cap Q \in Rat(M)$ for every $P \in Rat(M)$ and $Q \in Rec(M)$

$R(P) \in Rat(N)$ for every $R \in Rat(M \times N)$ and $P \in Rec(M)$

$Rec(M) \subseteq Rat(M)$ if $M$ is finitely generated (McKnight theorem)

$Rec(M) = Rat(M)$ if $M$ is free finitely generated (Kleene theorem)

$R \in Rec(M \times N) \iff R = \bigcup_{i \in I} P_i \times Q_i$ for $I$ finite
with $P_i \in Rec(M)$, $Q_i \in Rec(N)$ (Mezei theorem)

We restrict now to rational and recognizable relations on words. Henceforth $N$
is an alphabet i.e. a finite set of symbols called letters.

The set $N^* = \{ (a_1, \ldots, a_n) \mid n \geq 0 \wedge a_1, \ldots, a_n \in N \}$ is a
monoid for the concatenation operator:
$(a_1, \ldots, a_m).(b_1, \ldots, b_n) = (a_1, \ldots, a_m, b_1, \ldots, b_n)$.
Any element $(a_1, \ldots, a_n)$ is written simply $a_1 \ldots a_n$ and called a word,
and the neutral element $()$ is denoted by $\varepsilon$ and called the empty word.
Note that a word $u$ over $N$ of length $|u| \in \mathbb{N}$ is a mapping from $[|u|]$
into $N$ represented by $u(1) \ldots u(|u|) = u$. The mirror of any word $u$ is the
word $\widetilde{u} = u(|u|) \ldots u(1)$. A language $L$ is a set of words:
$L \subseteq N^*$. Let $\widetilde{L} = \{ \widetilde{u} \mid u \in L \}$ be the mirror
of any language $L$, and let
$\widetilde{R} = \{ (\widetilde{u}, \widetilde{v}) \mid u\, R\, v \}$ be the mirror of any
binary relation $R$ on $N^*$. When $L$ and $R$ are finite, we denote by
$|L| = \sum_{u \in L} |u|$ the length of $L$ and by
$|R| = \sum_{(u,v) \in R} |u| + |v|$ the length of $R$; in particular
$|Dom(R)| + |Im(R)| \leq |R|$. Furthermore we denote by
$N_L = \{ u(i) \mid u \in L \wedge i \in [|u|] \}$ the alphabet of letters in $L$, and
by $N_R = N_{Dom(R)} \cup N_{Im(R)}$ the alphabet of $R$. As $N^*$ is the free monoid
generated by $N$, $Rec(N^*) = Rat(N^*)$ is the set of languages recognized by the
(deterministic and/or complete) $N$-automata. A rational relation on $N^*$, i.e. an
element of $Rat(N^* \times N^*)$, is a relation recognized by a finite
$N^* \times N^*$-automaton called a transducer. Furthermore
$R \in Rec(N^* \times N^*)$ if and only if $R = \bigcup_{i \in I} P_i \times Q_i$ for some
finite $I$ with $P_i, Q_i \in Rat(N^*)$; in particular
$Id_{N^*} \in Rat(N^* \times N^*) - Rec(N^* \times N^*)$. Another remark is that
$R(P) \in Rat(N^*)$ for every $R \in Rat(N^* \times N^*)$ and $P \in Rat(N^*)$. A
crucial property which is not true for any monoid product is the Elgot-Mezei
theorem: $Rat(N^* \times N^*)$ is closed by composition. The family
$Rec(N^* \times N^*)$ is also closed by composition, and more generally
$R \circ S,\ S \circ R \in Rec(N^* \times N^*)$ for every $R \in Rat(N^* \times N^*)$ and
$S \in Rec(N^* \times N^*)$. Obviously $Rec(N^* \times N^*)$ is closed by mirror, and
$Rat(N^* \times N^*)$ is also closed by mirror: for any $N^* \times N^*$-graph $G$, we
have $L(G, E, F)^{\sim} = L(\widetilde{G}, F, E)$ with
$\widetilde{G} = \{ q \xrightarrow{\widetilde{u}/\widetilde{v}} p \mid p \xrightarrow[G]{u/v} q \}$.



3 Rational Derivation 

We consider the word-rewriting systems (see for instance the survey [DJ 90] and
[RO 98]) associated with a language of admissible words such that any derivation
between admissible words contains only admissible words. Like in [sT^ , we
define several subclasses of rewriting systems by considering the overlappings
between the left hand sides (of the rules) and the right hand sides, inside of the
admissible words. We extract two families of systems, the right systems and the
prefix systems, having a rational derivation even if the systems are recognizable
(Theorems 3.8 and 3.11). By mirror, we obtain two other families of systems,
the left systems and the suffix systems. Besides these four families, we give
simple systems having a non rational derivation.

A (word) rewriting system $(R, C)$ is a binary relation $R$ on $N^*$ and a language
$C \subseteq N^*$ of configurations (or admissible words). A system $(R, C)$ is
respectively finite, recognizable, rational if $C$ is rational, and if $R$ is
respectively finite, recognizable, rational. The rewriting
$\xrightarrow[R,C]{}$ according to any system $(R, C)$ is

$$\xrightarrow[R,C]{} \ := \{ (xuy, xvy) \in C \times C \mid u\, R\, v \wedge x, y \in N^* \}$$

the application of $R$ under any left and right contexts, but restricted to
configurations. Furthermore the derivation $\xrightarrow[R,C]{*}$ according to
$(R, C)$ is

$$\xrightarrow[R,C]{*} \ := \{ (u_0, u_n) \in C \times C \mid n \geq 0 \wedge \exists u_1, \ldots, u_{n-1},\ u_0 \xrightarrow[R,C]{} u_1 \ldots u_{n-1} \xrightarrow[R,C]{} u_n \}$$

the reflexive (restricted to $C$) and transitive closure of $\xrightarrow[R,C]{}$ by
composition i.e.
$\xrightarrow[R,C]{*} = \bigcup_{n \geq 0} \xrightarrow[R,C]{n}$ where
$\xrightarrow[R,C]{0} = Id_C$ and
$\xrightarrow[R,C]{n+1} = \xrightarrow[R,C]{n} \circ \xrightarrow[R,C]{}$ for every
$n \geq 0$.
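Note that for any system $(R, C)$,

$$\xrightarrow[R^{-1},C]{} = \bigl(\xrightarrow[R,C]{}\bigr)^{-1} \quad \text{and} \quad \xrightarrow[\widetilde{R},\widetilde{C}]{} = \bigl(\xrightarrow[R,C]{}\bigr)^{\sim}$$

hence

$$\xrightarrow[R^{-1},C]{*} = \bigl(\xrightarrow[R,C]{*}\bigr)^{-1} \quad \text{and} \quad \xrightarrow[\widetilde{R},\widetilde{C}]{*} = \bigl(\xrightarrow[R,C]{*}\bigr)^{\sim}.$$

When the configuration set $C = N^*$, it can be omitted: we usually say that the
relation $R$ is a rewriting system and we denote by $\xrightarrow[R]{}$ its rewriting
(instead of $\xrightarrow[R,N^*]{}$) and by $\xrightarrow[R]{*}$ its derivation. Note
that $\xrightarrow[R,C]{} = \xrightarrow[R]{} \cap\ C \times C$.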

Even if the configuration set C is rational, it is possible by restriction to C 
to control the rewriting of simple finite relations in order to get a non rational 
derivation. This is shown in the following example. 



Example 3.1 Consider the finite relation R = {{a,bd) , (b,c), (c, a)} and 
the rational configuration set C = [Jp^^q ^{a b ■ This finite system 

has a non rational derivation — ^ because the language ~^{ab) n ad*bd* = 

R.C R,C 

{ ad'^bd'^ I n > 0 } is not rational. 

However such a relation $R$ is prefix and is left ($\widetilde{R}$ is right) as
defined below (cf. Theorems 3.8 and 3.11), and in particular its derivation
$\xrightarrow[R]{*}$ is rational; it is recognized by the following transducer:

[Figure: transducer recognizing the derivation; the only label recoverable from the source is $\varepsilon/\varepsilon$]

with $(i, j) \in \{(a,a), (a,bd), (a,cd), (b,a), (b,b), (b,c), (c,a), (c,bd), (c,c)\}$.

Thus we introduce a general condition on the systems to be studied. A system
$(R, C)$ is stable if it satisfies the following condition:

$$s \xrightarrow[R]{*} r \xrightarrow[R]{*} t \ \wedge\ s, t \in C \ \Longrightarrow\ r \in C$$

Such a general condition is undecidable but there exist decidable sufficient
conditions like the closure of $C$ by rewriting: $\xrightarrow[R]{}(C) \subseteq C$.
In particular any relation $R$ (on $N^*$) is stable. A basic property of any stable
system is that its derivation is the restriction to the configurations of the
derivation of its relation.

Lemma 3.2 For any stable system $(R, C)$,
$\xrightarrow[R,C]{*} = \xrightarrow[R]{*} \cap\ C \times C$.

For $C$ rational, $C \times C$ is a recognizable relation, and Lemma 3.2 implies that
if $\xrightarrow[R]{*}$ is a rational (resp. recognizable) relation then
$\xrightarrow[R,C]{*}$ is a rational (resp. recognizable) relation. However we will
give general families of systems $(R, C)$ such that $\xrightarrow[R,C]{*}$ is
rational, but not containing $(R, N^*)$ in general.

To study the rationality of derivation $\xrightarrow[R,C]{*}$ we consider the
composition $\xrightarrow[R,C]{} \circ \xrightarrow[R,C]{}$ of two rewritings, and we
examine the possible overlappings between the right hand side of the rule applied
in the first rewriting, with the left hand side of the rule applied in the second
rewriting. We extricate families of systems having a rational derivation by
discarding the undesirable overlapping rules.

It is easy to find simple finite relations having a non rational derivation.

Example 3.3 For $R = \{(ab, aabb)\}$,
$\xrightarrow[R]{*}(ab) = \{ a^n b^n \mid n \geq 1 \} \notin Rat(\{a,b\}^*)$ hence
$\xrightarrow[R]{*}$ is not rational. Similarly the derivation of
$R^{-1} = \{(aabb, ab)\}$ is not rational because
$\xrightarrow[R^{-1}]{*} = (\xrightarrow[R]{*})^{-1}$. These relations are
strict-internal as defined below.
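
The behaviour in Example 3.3 can be observed on a small scale with the following Python sketch. The helper names `rewrite_once` and `derive` are hypothetical, the configuration set is taken to be all words, and the exploration is cut at a length bound; the cut is harmless here only because the single rule is length-increasing, which is an assumption specific to this example.

```python
def rewrite_once(word, rules):
    """All words xvy obtained from word = xuy with (u, v) a rule."""
    results = set()
    for lhs, rhs in rules:
        start = word.find(lhs)
        while start != -1:
            results.add(word[:start] + rhs + word[start + len(lhs):])
            start = word.find(lhs, start + 1)
    return results


def derive(start, rules, max_len):
    """Words of length <= max_len reachable from start by derivation."""
    reached = {start}
    frontier = [start]
    while frontier:
        word = frontier.pop()
        for successor in rewrite_once(word, rules):
            if len(successor) <= max_len and successor not in reached:
                reached.add(successor)
                frontier.append(successor)
    return reached


R = {("ab", "aabb")}
words = derive("ab", R, 8)
```

Up to length 8, the words reached from `ab` are exactly `ab`, `aabb`, `aaabbb` and `aaaabbbb`, in line with the language stated in the example.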



We say that a system $(R, C)$ is domain-strict-internal if

$$\exists\, s, t \in N^* \ \exists\, (w, xuy), (u, v) \in R,\quad x, y \neq \varepsilon \ \wedge\ swt,\ sxuyt,\ sxvyt \in C$$

meaning that the following representation is allowed:




which is decidable for (R,C) rational: 3s,tGN*, 

{R n xs-iCf-i) o n X s-icf-i) ^ 0 

Let us illustrate the significance of the configuration set C for this definition. 

Example 3.4 The relation $R = \{(\varepsilon, ab)\}$ is domain-strict-internal and
its derivation is not rational because the language
$\xrightarrow[R]{*}(\varepsilon) \cap a^*b^* = \{ a^n b^n \mid n \geq 0 \}$ is not
rational. On the other hand $(R, (ab)^*)$ is not domain-strict-internal (and is
stable) and
$\xrightarrow[R,(ab)^*]{*} = Id_{(ab)^*}.(\{\varepsilon\} \times (ab)^*)$ is rational.

Similarly a system $(R, C)$ is image-strict-internal if $(R^{-1}, C)$ is
domain-strict-internal, meaning that the following representation is allowed:

Finally a system is strict-internal if it is domain-strict-internal or
image-strict-internal.

Other notions of internal systems can be obtained by prefixity and suffixity.
A system $(R, C)$ is domain-prefix-internal if

$$\exists\, s, t \in N^* \ \exists\, (w, uy), (u, v) \in R,\quad y \neq \varepsilon \ \wedge\ swt,\ suyt,\ svyt \in C$$

meaning that the following representation is allowed:

[Figure: overlap diagram showing $Dom(R)$ and $Im(R)$, with a part marked $\neq \varepsilon$]

which is decidable for {R, C) rational: 

3 s, f G N*, {R n X s-iCf-i) o n X yf 0 

Similarly we say that $(R, C)$ is

image-prefix-internal if $(R^{-1}, C)$ is domain-prefix-internal,
domain-suffix-internal if $(\widetilde{R}, \widetilde{C})$ is domain-prefix-internal,
image-suffix-internal if $(\widetilde{R}^{-1}, \widetilde{C})$ is domain-prefix-internal.

Finally a system is prefix-internal if it is domain-prefix-internal or
image-prefix-internal. Similarly a system is suffix-internal if it is
domain-suffix-internal or image-suffix-internal. Furthermore a system is
domain-internal if it is domain-strict-internal or domain-prefix-internal or
domain-suffix-internal. Similarly a system is image-internal if it is
image-strict-internal or image-prefix-internal or image-suffix-internal.

Note that it is again easy to find non-internal relations having a non rational
derivation.



Example 3.5 For $R = \{(ba, ab)\}$, the language
$\xrightarrow[R]{*}((ab)^*) \cap a^*b^*$ is equal to $\{ a^n b^n \mid n \geq 0 \}$ hence
$\xrightarrow[R]{*}$ is not rational. Such a relation is both left-overlapping and
right-overlapping as defined below.

We say that a system $(R, C)$ is left-overlapping if

$$\exists\, s, t \in N^* \ \exists\, (u, yz), (xy, v) \in R,\quad x, y, z \neq \varepsilon \ \wedge\ sxut,\ sxyzt,\ svzt \in C$$

meaning that the following representation is allowed:

[Figure: overlap diagram for a left-overlapping system, showing $Dom(R)$ and $Im(R)$ inside a configuration of $C$, with parts marked $\neq \varepsilon$]

Let us verify that we can decide whether a rational system $(R, C)$ is
left-overlapping. Let \$ be a new symbol (not in $N$). So

$$C_\$ = \{ u\$v \mid uv \in C \} = Id_{N^*}.\{(\varepsilon, \$)\}.Id_{N^*}\,(C)$$

and

$$R_\$ = \{ (u\$v, w) \mid uv\, R\, w \wedge v \neq \varepsilon \} = Id_{N^*}.\{(\$, \varepsilon)\}.Id_{N^+} \circ R$$

are rational: $C_\$ \in Rat((N \cup \{\$\})^*)$ and
$R_\$ \in Rat((N \cup \{\$\})^* \times N^*)$.

Then $(R, C)$ is left-overlapping if and only if it satisfies the following
decidable property: $\exists\, s, t \in N^*$,

$$(Id_{N^+}.(\varepsilon, \$).R \ \cap\ s^{-1}Ct^{-1} \times s^{-1}C_\$t^{-1}) \circ (R_\$.Id_{N^+} \ \cap\ s^{-1}C_\$t^{-1} \times s^{-1}Ct^{-1}) \neq \emptyset$$
Similarly a system $(R, C)$ is right-overlapping if $(R^{-1}, C)$ is
left-overlapping (which is equivalent to $(\widetilde{R}, \widetilde{C})$ is
left-overlapping), meaning that the following representation is allowed:

[Figure: overlap diagram for a right-overlapping system, showing $Im(R)$ and $Dom(R)$ inside a configuration of $C$, with parts marked $\neq \varepsilon$]

Finally a system is overlapping if it is left-overlapping or right-overlapping.

We are ready to give stable systems such that the derivation can be done
increasingly.

Precisely, we denote by $\xrightarrow[R,C]{}_n$ the rewriting of $(R, C)$ at letter
position $n + 1$:

$$xuy \xrightarrow[R,C]{}_n xvy \quad \text{for every } u\, R\, v \text{ and } xuy, xvy \in C \text{ with } |x| = n.$$




This permits to define the following increasing derivation : 

* II ” 

(rt. “ Un>0 M 



where ^ = Id„ 

R.C ^ 

and ^ =[jp < o . . . o — for every n > 0 

with li-i=li => — fp. is only according to i? — {e}xA^*. 

R, c * 

This last condition means that the following derivation:

$$xuy \longrightarrow_{|x|} xvy \longrightarrow_{|x|} xwvy \quad \text{with } u\, R\, v \text{ and } \varepsilon\, R\, w$$

is not increasing. In fact in this derivation, the rule $\varepsilon \to w$ is on the
‘left’ of the rule $u \to v$ and must be applied before to give the following
increasing derivation:

$$xuy \longrightarrow_{|x|} xwuy \longrightarrow_{|xw|} xwvy \quad (\text{assuming that } w \neq \varepsilon).$$

Lemma 3.2 remains true for increasing derivations.



Lemma 3.6 For any stable system (R,C), 



★ 

R, C 



^ n CxC. 



The increasing derivation coincides with the derivation for stable systems having 
no overlapping configurations where the domain begins before the image. 



Lemma 3.7 For any stable system $(R, C)$ which is not left-overlapping, not
image-strict-internal and not image-suffix-internal, the derivation of $(R, C)$
coincides with its increasing derivation.


Proof. 

By definition, we have ^ C — . 

R.C R.C 

Let us prove the converse. As ^ , we may assume that (e, e) ^ R. 

R.C R - {e, e}, C 
We show the following four inclusions:

[Display: four inclusions between compositions of positioned rewritings; the formulas are not recoverable from the source]

where $\longrightarrow_P = \bigcup_{n \in P} \longrightarrow_n$ for any integer subset
$P$ and with $(u, v) \in R$.

Using these inclusions, we sort increasingly any derivation by applying the bubble 
sort. 

□ 

A first class of rewriting systems with a decidable rational derivation is obtained
by generalizing the mechanism of a transducer. Let (G, E, F) be a transducer:
G is a finite N*×N*-automaton and we assume that its vertex set V_G is
disjoint from N. We convert G into the following relation:

\[ R_G = \{\, (pu, vq) \mid p \xrightarrow[G]{u/v} q \,\} \]

in such a way that the language recognized by the transducer is obtained by
derivation of R_G as follows:

\[ L(G, E, F) = \{\, (u, v) \mid pu \xrightarrow[R_G]{*} vq \,\wedge\, p \in E \,\wedge\, q \in F \,\} \]
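
The conversion of a transducer into a rewriting relation can be sketched as follows. This is an illustration under assumptions, not the paper's construction as such: states are single uppercase characters (so the vertex set is disjoint from the lowercase alphabet), an edge is a tuple `(p, u, v, q)` read as "p goes to q reading u and writing v", and the length bound in `recognizes` is a pruning device of this sketch that uses the fact that every word on a successful path has the shape (prefix of v)(state)(suffix of u).

```python
from collections import deque

def transducer_to_rules(edges):
    """Turn each edge p --u/v--> q into the rewriting rule (pu, vq)."""
    return {(p + u, v + q) for (p, u, v, q) in edges}

def one_step(word, rules):
    """All words obtained from `word` by one rewriting step."""
    results = set()
    for lhs, rhs in rules:
        i = word.find(lhs)
        while i != -1:
            results.add(word[:i] + rhs + word[i + len(lhs):])
            i = word.find(lhs, i + 1)
    return results

def recognizes(edges, initial, final, u, v):
    """Decide whether pu derives vq for some initial p and final q."""
    rules = transducer_to_rules(edges)
    targets = {v + q for q in final}
    bound = len(u) + len(v) + 1          # output so far + state + remaining input
    seen = {p + u for p in initial}
    queue = deque(seen)
    while queue:
        word = queue.popleft()
        if word in targets:
            return True
        for nxt in one_step(word, rules):
            if len(nxt) <= bound and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

# Hypothetical transducer: P copies each a to b, then emits c and stops in Q.
EDGES = [("P", "a", "b", "P"), ("P", "", "c", "Q")]
```

With these edges, the pair (aa, bbc) is recognized through the derivation Paa → bPa → bbP → bbcQ.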

Such a system (R_G, N*V_G N*) is right, meaning that it is not strict-internal, not
domain-prefix-internal, not image-suffix-internal, and not left-overlapping. So a
right system (R, C) is a system where the overlapping configurations have only
the following form:



It is important to remark that the relation R_G (on (N ∪ V_G)*) is not right: we
may have the overlapping configuration puwvq with puw ∈ Dom(R_G) and
wvq ∈ Im(R_G).

The derivation \(\xrightarrow[R,C]{*}\) of any finite right stable system (R, C) can always be
recognized by a transducer that we can construct from (R, C), and this can be
generalized to any recognizable right stable system.

Theorem 3.8 For any recognizable right stable system (R, C), the derivation
\(\xrightarrow[R,C]{*}\) is an effective rational relation.

Proof.

i) Let us reduce the proof of this theorem to C = N*.

Let (R, C) be a recognizable system: R is recognizable and C is rational.
Furthermore we assume that (R, C) is stable and is right: it is not left-overlapping,
not strict-internal, not domain-prefix-internal and not image-suffix-internal.

Let $, # be two new symbols: $, # ∉ N. We consider the following system:

S = { (x#y, $v) | xy R v }

So S' = {{e,$)}.{Id^. a R) is recognizable. 

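As S ⊆ N*#N* × $N*, S is a right relation (on (N ∪ {$, #})*).

We verify that

\[ \xrightarrow[R,C]{*} \;=\; \{\, (h_{\#}(s), h_{\$}(t)) \in C \times C \mid s \xrightarrow[S]{*} t \,\} \]

where, for & ∈ {#, $}, h_& is the morphism from (N ∪ {&})* to N* erasing &:
h_&(&) = ε and h_&(a) = a for every a ∈ N.

Thus

\[ \xrightarrow[R,C]{*} \;=\; \Big( (\{(\varepsilon,\#)\} \cup Id_N)^* \circ \xrightarrow[S]{*} \circ (\{(\$,\varepsilon)\} \cup Id_N)^* \Big) \cap C \times C \]

implying that \(\xrightarrow[R,C]{*}\) is rational if \(\xrightarrow[S]{*}\) is rational for the recognizable right
relation S.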

ii) Let R be a finite right relation.

Let us construct from R a transducer to recognize \(\xrightarrow[R]{*}\).

Its finite set of states Q is

Q = { w | ∃ x, y ∈ N*, xw ∈ Im(R) ∧ wy ∈ Dom(R) }

which contains ε. Its finite graph G is
G = H ∪ I

where \( H = \{\, w \xrightarrow{x/y} z \mid w, z \in Q \,\wedge\, wx\,R\,yz \,\wedge\, |x|, |y| \text{ minimal} \,\} \)

and \( I = \{\, w \xrightarrow{x/\varepsilon} wx \mid w, wx \in Q \,\wedge\, |x| \ge 1 \text{ minimal} \,\} \)

\( \qquad \cup\ \{\, yz \xrightarrow{\varepsilon/y} z \mid z, yz \in Q \,\wedge\, |y| \ge 1 \text{ minimal} \,\} \)

\( \qquad \cup\ \{\, \varepsilon \xrightarrow{a/a} \varepsilon \mid a \in N \,\} \)

We take ε as the initial state and as the unique final state. We show that the
transducer (G, {ε}, {ε}) recognizes \(\xrightarrow[R]{*}\).

To implement I, and as #N can be large, we use a new symbol • to designate
any letter in N, and the label •/• means any couple a/a for a ∈ N.

The minimality of |x| and |y| in H and I is not needed, but it allows us to
construct a graph isomorphic to G with a (worst case) complexity O(|R|) in
space and 0{\R\'^#N^) in time i.e. linear in space and quadratic in time if 
we assume that the number of letters is a constant. 
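
The state set Q of step (ii) consists of the words that are at once a suffix of an image word and a prefix of a domain word. A minimal sketch, assuming a finite relation given as a set of string pairs; the function name `states` and the sample relation are hypothetical, and the sample is not checked to be a right relation.

```python
def states(rules):
    """Words w such that xw is in Im(R) and wy is in Dom(R) for some x, y."""
    image_suffixes = {v[i:] for _, v in rules for i in range(len(v) + 1)}
    domain_prefixes = {u[:i] for u, _ in rules for i in range(len(u) + 1)}
    return image_suffixes & domain_prefixes

SAMPLE = {("ab", "ca")}
Q = states(SAMPLE)
```

For a non-empty relation the empty word always belongs to the result, which matches the remark that Q contains ε.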

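iii) Let us extend (ii) to any recognizable right relation R. As R is recognizable,

\[ R = \bigcup_{i=1}^{p} L(G_i, r_i, F_i) \times L(G'_i, r'_i, F'_i) \]

where (G_i, r_i, F_i) and (G'_i, r'_i, F'_i), for 1 ≤ i ≤ p, are finite automata such that
their vertex sets V_{G_1}, V_{G'_1}, ..., V_{G_p}, V_{G'_p} are pairwise disjoint.

Let us construct from R a transducer to recognize \(\xrightarrow[R]{*}\).

Let r be a new symbol. We define the following finite graph:

G = H ∪ I

with \( I = \{\, r \xrightarrow{a/a} r \mid a \in N \,\} \ \cup\ \{\, r \xrightarrow{\varepsilon/\varepsilon} r_i \mid i \in [p] \,\} \ \cup\ \{\, s' \xrightarrow{\varepsilon/\varepsilon} r \mid \exists\, i \in [p],\ s' \in F'_i \,\} \)

\( \qquad \cup\ \{\, s \xrightarrow{a/\varepsilon} t \mid \exists\, i \in [p],\ s \xrightarrow[G_i]{a} t \,\} \ \cup\ \{\, s' \xrightarrow{\varepsilon/a} t' \mid \exists\, i \in [p],\ s' \xrightarrow[G'_i]{a} t' \,\} \)

and \( H = \{\, s \xrightarrow{\varepsilon/\varepsilon} r'_i \mid i \in [p] \,\wedge\, s \in F_i \,\} \)

\( \qquad \cup\ \{\, t' \xrightarrow{\varepsilon/\varepsilon} s \mid \exists\, i, j \in [p],\ \exists\, u \ne \varepsilon,\ u \in L(G'_i, t', F'_i) \cap L(G_j, r_j, s) \,\} \)

We take r as the initial state and as the unique final state. We show that the
transducer (G, r, r) recognizes \(\xrightarrow[R]{*}\).

□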



Similarly a system (R, C) is left if (R̃, C̃) is right: (R, C) is not strict-internal,
not domain-suffix-internal, not image-prefix-internal, and not right-overlapping.
Note that (R, C) is left if and only if (R⁻¹, C) is right. By Theorem 3.8 and as
the rational relations are preserved by mirror (or by inverse), the derivation of
any recognizable left stable system is also an effective rational relation. Note that
the condition for a right system to be not domain-prefix-internal is necessary to
have a rational derivation.



Example 3.9 For R = {($, a&), (&, $b)}, \(\xrightarrow[R]{*}(\$) \cap \{a, b, \$\}^* = \{\, a^n \$ b^n \mid n \ge 0 \,\}\),
hence \(\xrightarrow[R]{*}\) is not rational. This relation R is not overlapping (not left-overlapping
and not right-overlapping), is not strict-internal (not domain-strict-internal
and not image-strict-internal), and is not image-internal (not image-strict-internal,
not image-prefix-internal and not image-suffix-internal). In particular,
its derivation is increasing but the system is not right because it is
domain-prefix-internal (and domain-suffix-internal).
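
The language of Example 3.9 can be observed by enumerating the derivation up to a length bound. This is a small sketch, not part of the paper: `derivable` is a hypothetical helper, and cutting off at `max_len` is sound here only because neither rule shortens a word.

```python
def derivable(start, rules, max_len):
    """Words reachable from `start` without ever exceeding `max_len` letters."""
    seen = {start}
    stack = [start]
    while stack:
        word = stack.pop()
        for lhs, rhs in rules:
            i = word.find(lhs)
            while i != -1:
                nxt = word[:i] + rhs + word[i + len(lhs):]
                if len(nxt) <= max_len and nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
                i = word.find(lhs, i + 1)
    return seen

R = [("$", "a&"), ("&", "$b")]
# Keep only the words over {a, b, $}, as in the example.
words = {w for w in derivable("$", R, 7) if set(w) <= set("ab$")}
```

The derivation from $ is the single chain $ → a& → a$b → aa&b → aa$bb → ..., so the filtered set contains exactly the words aⁿ$bⁿ that fit within the bound.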

A second class of rewriting systems follows from the relation of prefix rewriting.
The prefix rewriting \(\underset{R}{\longmapsto}\) of a system R is the restriction of the rewriting \(\xrightarrow[R]{}\)
obtained by applying the rules only by prefix:

\[ uw \underset{R}{\longmapsto} vw \quad \text{for every } u\,R\,v \text{ and } w \in N^* \]

meaning that the prefix rewriting \(\underset{R}{\longmapsto}\) is the relation R.Id_{N*}. The prefix
derivation \(\underset{R}{\overset{*}{\longmapsto}}\) is the reflexive and transitive closure for the composition of the
prefix rewriting. Büchi has shown that the prefix derivation \(\underset{R}{\overset{*}{\longmapsto}}(u)\) of any finite
relation from any word u is a rational language which can be constructed in
exponential time [Bü 64]. Boasson and Nivat have extended this result: the prefix
derivation \(\underset{R}{\overset{*}{\longmapsto}}\) of any recognizable relation R is a rational relation [BN 84]
and a transducer can be constructed in polynomial time [Ca 90]. To adapt this
result for rewriting systems, let us remark that for any system R, we have

\[ x \underset{R}{\longmapsto} y \iff \$x \xrightarrow[\$R]{} \$y \]

where $R = { ($u, $v) | u R v } with $ a new symbol.
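
The remark that prefix rewriting by R is ordinary rewriting by $R can be tested on small words. A sketch under assumptions: the relation below is hypothetical, `$` does not occur in it, and the helper names are illustrative.

```python
def prefix_step(word, rules):
    """One step of prefix rewriting: replace a prefix u by v."""
    return {v + word[len(u):] for (u, v) in rules if word.startswith(u)}

def step(word, rules):
    """One step of ordinary rewriting: replace any occurrence of u by v."""
    results = set()
    for lhs, rhs in rules:
        i = word.find(lhs)
        while i != -1:
            results.add(word[:i] + rhs + word[i + len(lhs):])
            i = word.find(lhs, i + 1)
    return results

def dollar(rules):
    """The relation $R = { ($u, $v) | u R v }."""
    return {("$" + u, "$" + v) for (u, v) in rules}

R = {("a", "ba"), ("ab", "")}
```

Since `$` occurs only at the front of `$x`, every rule of `$R` can only fire there, which is why the two step relations agree.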

We say that a rewriting system (R, C) is prefix if (R, C) is not overlapping
(not left-overlapping and not right-overlapping), not strict-internal (not
domain-strict-internal and not image-strict-internal), and not suffix-internal (it is not
domain-suffix-internal and not image-suffix-internal). So a prefix system is a
system where the overlapping configurations have only the following form:

[Figure: overlapping configuration in C involving Im(R) and Dom(R).]

Note that (R, C) being prefix is equivalent to (R⁻¹, C) being prefix.

A usual finite prefix system is an (unlabelled) pushdown automaton, i.e. a finite
R ⊆ Q.P × Q.P* with the language C = Q.P* of configurations, where
P + Q = N (N is partitioned into the stack alphabet P and the state alphabet
Q); note that such an R (with C = N*) is also a prefix relation.

The derivation of any prefix stable system is the restriction to the admissible
configurations of the concatenation closure of the prefix derivation of its relation.

Proposition 3.10 For any prefix stable system (R, C), we have

\[ \xrightarrow[R,C]{*} \;=\; \Big( \underset{R}{\overset{*}{\longmapsto}} \Big)^{*} \cap\, C \times C. \]

Proposition 3.10 permits to extend the rationality of the prefix derivation of any
recognizable relation to the derivation of any recognizable prefix stable system.

Theorem 3.11 For any recognizable prefix stable system (R, C), the derivation
\(\xrightarrow[R,C]{*}\) is an effective rational relation.

Proof.

By Proposition 3.10, it remains to show that \(\underset{R}{\overset{*}{\longmapsto}}\) is a rational relation for any
recognizable relation R. This has been proved in [BN 84]. A short proof is due
to J.-M. Autebert and follows from Corollary 3.3 of [Ca 96]:

1 -^ = I — > = S-Id^., for a recognizable relation S = [JJ.UixVi 

RS ^ » J- 

such that for N a new alphabet in bijection to N, we have 
ULi = ^({ uf\uRv}*) n N*N* 
where P = { (xx,e) | 2 ; G A' }. 
Note that we can specify directly S from R:

S = {(£,£)} u I 

R-1 R 

The construction of S hence of S.Id^, can be done in polynomial time. But 
we do not give a precise majoration of the order like for Theorem 13.1 II fHl. 

□ 

Similarly a system (R, C) is suffix if (R̃, C̃) is prefix: (R, C) is not overlapping,
not strict-internal and not prefix-internal. By Theorem 3.11 and as the rational
relations are preserved by mirror, the derivation of any suffix recognizable stable
system is also an effective rational relation.

Example 3.9 shows that we have non rational derivations for prefix finite systems
which can be domain-suffix-internal or image-suffix-internal (systems which are
not overlapping, not strict-internal, and not domain-suffix-internal or not
image-suffix-internal). This includes the basic systems [Se 93] even if they are not
strict-internal, where a basic system is a not overlapping and not domain-internal
system (the inverse of the system defined in Example 3.9 is basic and not
strict-internal).

Furthermore we cannot combine our two theorems, as shown below by slightly
modifying Example 3.9.

Example 3.12 The overlapping configurations of R = {($, a&), (&b, $bb)} are
only prefix (N*$bbN*) and right (N*a&bN*) but \(\xrightarrow[R]{*}\) is not rational because
the language \(\xrightarrow[R]{*}(\$b) \cap \{a, b, \&\}^* = \{\, a^n \& b^n \mid n \ge 1 \,\}\) is not rational.



Finally Theorems 3.8 and 3.11 cannot be extended to respectively any rational
right stable system and any rational prefix stable system, as shown in the following
example.

Example 3.13 The relation R = { ($xu$, $u$x) | u ∈ {a, b}* ∧ x ∈ {a, b} }
is rational and taking C = ${a, b}*${a, b}*, the rational system (R, C) has
only domain-prefix-internal overlapping configurations. So (R, C) is prefix and
is left but

\[ \xrightarrow[R,C]{*} \cap\ (\$\{a, b\}^*\$) \times (\$\$\{a, b\}^*) = \{\, (\$u\$, \$\$\tilde{u}) \mid u \in \{a, b\}^* \,\} \]

is not a rational relation, hence \(\xrightarrow[R,C]{*}\) is not rational.
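
The behaviour of Example 3.13 is easy to trace: each step moves the letter just after the first $ to just after the second $, so the word between the two markers comes out mirrored. A sketch with hypothetical helper names, assuming configurations are strings of the form $u$w over {a, b}:

```python
from itertools import product

def step(config):
    """One application of ($xu$, $u$x) to a configuration $ x u $ w, or None."""
    second = config.index("$", 1)      # position of the second marker
    if second == 1:                    # nothing left between the two markers
        return None
    x, u, w = config[1], config[2:second], config[second + 1:]
    return "$" + u + "$" + x + w

def normal_form(config):
    """Apply the rule until no letter remains between the two markers."""
    nxt = step(config)
    while nxt is not None:
        config = nxt
        nxt = step(config)
    return config
```

For instance $abb$ → $bb$a → $b$ba → $$bba, and bba is the mirror of abb.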
Abstract. We refine the simulation technique introduced in [10] to show
strong normalization of λ-calculi with explicit substitutions via termination
of cut elimination in proof nets. We first propose a notion
of equivalence relation for proof nets that extends the one in [9], and
we show that cut elimination modulo this equivalence relation is terminating.
We then show strong normalization of the typed version of the
λl-calculus with de Bruijn indices (a calculus with full composition defined
in [8]) using a translation from typed λl to proof nets. Finally, we
propose a version of typed λl with named variables which helps to better
understand the complex mechanism of the explicit weakening notation
introduced in the λl-calculus with de Bruijn indices [8].



1 Introduction 

This paper uses linear logic’s proof nets, equipped with an extended notion of 
reduction, to provide several new results in the field of explicit substitutions. It is 
also an important step forward in clarifying the connection between explicit sub- 
stitutions and proof nets, two well established formalisms that have been used 
to gain a better understanding of the λ-calculus over the past decade. On one
side, explicit substitutions provide an intermediate formalism that, by decomposing
the β rule into more atomic steps, allows a better understanding of the
execution models. On the other side, linear logic decomposes the intuitionistic
logical connectives, like the arrow, into more atomic, resource-aware connectives,
like the linear arrow and the explicit erasure and duplication operators given by
the exponentials: this decomposition is reflected in proof nets, which are the
computational side of linear logic, and provides a more refined computational
model than the one given by the λ-calculus, which is the computational side of
intuitionistic logic¹.

¹ Using various translations of the λ-calculus into proof nets, new abstract machines
have been proposed, exploiting the Geometry of Interaction and the Dynamic Alge- 
bras muisi, leading to the works on optimal reduction msiin]. 

J. Tiuryn (Ed.): FOSSACS 2000, LNCS 1784, pp. 63-FHTl 2000. 

(c) Springer- Verlag Berlin Heidelberg 2000 
The pioneer calculus with explicit substitutions, λσ, was introduced in [1] as a
bridge between the classical λ-calculus and concrete implementations of functional
programming languages. An important property of calculi with explicit
substitutions is nowadays known as PSN, which stands for "Preservation of
Strong Normalization": a calculus with explicit substitutions has PSN when all
λ-terms that are strongly normalizing using the traditional β-reduction rule are
also strongly normalizing w.r.t. the more refined reduction system defined using
explicit substitutions. But λσ does not preserve β-strong normalization, as shown
by Melliès, who exhibited a well-typed term which, due to the substitution composition
rules in λσ, is not λσ-strongly normalizing [18].

Since then, a quest was started to find an “optimal” calculus having all of a 
wide range of desired properties: it should preserve strong normalization, but 
also be confluent (in a very large sense that implies the ability to compose sub- 
stitutions), and its typed version should be strongly normalizing. 

Meanwhile, in the linear logic community, many studies focused on the connection
between the λ-calculus (without explicit substitutions) and proof nets, trying to
find the proper variant or extension of proof nets that could be used to cleanly
simulate /3-reduction, like in |Z]. 

Finally, in [10], the first two authors of this work showed for the first time that
explicit substitutions could be tightly related to linear logic’s proof nets, by pro- 
viding a translation into a variant of proof nets from uniii], a simple calculus 
with explicit substitutions and named variables, but no composition. 

This connection was promising because proof nets seem to have many of the 
properties which are required of a “good” calculus of explicit substitutions, and 
especially the strong normalization in the presence of a reduction rule which 
is reminiscent of the composition rule at the heart of Mellies’ counterexample. 
But [10] only dealt with a calculus without composition, and the translation
was complex and obscure enough to make the task of extending it to the case of 
a calculus with composition quite a daunting one. 

In this paper, we can finally present a notion of reduction for Girard's proof nets
which is flexible enough to allow a natural and simple translation from David
and Guillaume's λl, a complex calculus of explicit substitution with de Bruijn
indices and full composition [8]. This translation allows us to prove that typed
λl is strongly normalizing, which is a new result confirming a conjecture in [8].
Also, the fact that in the translation all information about variable order is lost
suggests a version of typed λl with named variables which is immediately proved
to be strongly normalizing. This is due to the fact that only the type information
is used in the translation of both calculi. Also, the typed named version of λl
gives a better understanding of the mechanisms of labels existing in the calculus.
In particular, names make it possible to understand the fine manipulation of explicit
weakenings in λl without entering into the complicated details of renaming used
in a de Bruijn setting.



The paper is organized as follows: we first recall the basic definitions of linear
logic and proof nets and we introduce our refined reduction system for proof nets
(Section 2), then prove that it is strongly normalizing (Section 3). In Section
4 we recall the definition of the λl calculus with its type system, present the
translation into proof nets, and show strong normalization of typed λl. Finally,
we introduce a version of typed λl with named variables (Section 5), enjoying
the same good properties, and we conclude with some remarks and directions
for future work (Section 6).



2 Linear Logic, Proof Nets, and Extended Reduction 

We recall here some classical notions from linear logic, namely the linear se- 
quent calculus and proof nets, and some basic results concerning confluence and 
normalization. 



MELL: Multiplicative Exponential linear logic Let 𝒜 be a set of atomic formulae.
We suppose that 𝒜 is partitioned in two disjoint subsets representing positive
and negative atoms respectively.

The set of formulae of the Multiplicative Exponential fragment of linear logic
(called MELL) is defined by the following grammar, where a ∈ 𝒜:

F ::= a | F ⊗ F (tensor) | F ⅋ F (par) | !F (of course) | ?F (why not)

For every p ∈ 𝒜, we assume that there is p' ∈ 𝒜, called the linear negation
of the atom p. Linear negation of formulae is defined as follows:

\[ p^{\perp} = p' \qquad p'^{\perp} = p \qquad (!A)^{\perp} = {?}(A^{\perp}) \qquad (?A)^{\perp} = {!}(A^{\perp}) \qquad (A \otimes B)^{\perp} = A^{\perp} \parr B^{\perp} \qquad (A \parr B)^{\perp} = A^{\perp} \otimes B^{\perp} \]
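
The grammar and the negation table can be written down as a small data type. This is a sketch under assumptions, not the paper's formalization: the class names are illustrative, and the negated atom p' is modelled by a boolean flag on `Atom`. The clauses follow the standard De Morgan dualities of MELL.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Atom:
    name: str
    negated: bool = False    # Atom("p", True) plays the role of p'

@dataclass(frozen=True)
class Tensor:
    left: object
    right: object

@dataclass(frozen=True)
class Par:
    left: object
    right: object

@dataclass(frozen=True)
class OfCourse:              # !A
    body: object

@dataclass(frozen=True)
class WhyNot:                # ?A
    body: object

def neg(formula):
    """Linear negation, pushed down to the atoms."""
    if isinstance(formula, Atom):
        return Atom(formula.name, not formula.negated)
    if isinstance(formula, Tensor):
        return Par(neg(formula.left), neg(formula.right))
    if isinstance(formula, Par):
        return Tensor(neg(formula.left), neg(formula.right))
    if isinstance(formula, OfCourse):
        return WhyNot(neg(formula.body))
    if isinstance(formula, WhyNot):
        return OfCourse(neg(formula.body))
    raise TypeError(f"not a MELL formula: {formula!r}")

def is_exponential(formula):
    """A formula is exponential if it starts with ! or ?."""
    return isinstance(formula, (OfCourse, WhyNot))
```

Negation defined this way is an involution, so `neg(neg(f))` gives back `f` for every formula built from these constructors.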



The name MELL comes from the connectors ⊗ and ⅋ which are called "multiplicatives",
while ! and ? are called "exponentials". We say that a formula
is exponential if it starts with an exponential connector. While we refer the
interested reader to [13] for more details on linear logic, we give here a one-sided
presentation of the sequent calculus for MELL:



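\[ \dfrac{}{\vdash A, A^{\perp}}\ \text{Axiom} \qquad \dfrac{\vdash \Gamma, A \quad \vdash A^{\perp}, \Delta}{\vdash \Gamma, \Delta}\ \text{Cut} \]

\[ \dfrac{\vdash \Gamma, A, B}{\vdash \Gamma, A \parr B}\ \text{Par} \qquad \dfrac{\vdash \Gamma, A \quad \vdash B, \Gamma'}{\vdash \Gamma, A \otimes B, \Gamma'}\ \text{Times} \]

\[ \dfrac{\vdash \Gamma}{\vdash \Gamma, ?A}\ \text{Weakening} \qquad \dfrac{\vdash \Gamma, ?A, ?A}{\vdash \Gamma, ?A}\ \text{Contraction} \]

\[ \dfrac{\vdash \Gamma, A}{\vdash \Gamma, ?A}\ \text{Dereliction} \qquad \dfrac{\vdash A, ?\Gamma}{\vdash {!A}, ?\Gamma}\ \text{Box} \]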

MELL proof nets To every sequent derivation in MELL it is possible to associate
an object called a "proof net", which makes it possible to abstract from many
inessential details in a derivation, like the order of application of independent
logical rules: for example, there are many inessentially different ways to obtain
⊢ A₁ ⅋ A₂, ..., A_{n−1} ⅋ A_n from ⊢ A₁, ..., A_n, while there is only one proof net
representing all these derivations.
Proof nets are defined inductively by rules that follow closely the ones of the
one-sided sequent calculus, and the set of proof nets is denoted PN. To simplify
the drawing of a proof net, we use the following notation: a conclusion with a
capital Greek letter Γ, Δ, ... really stands for a set of conclusions, each one with
its own wire.



[Figure: the inductive rules building proof nets: (Axiom), (Cut), (Par), (Times), (Dereliction), (Weakening), (Contraction), (Box).]



Each box has exactly one conclusion preceded by a !, which is named “principal” 
port (or formula), while the other conclusions are named “auxiliary” ports (or 
formulae) . In what follows, we will sometimes write an axiom link as A A-^ . 

Reduction of proof nets Proof nets are the “computational object” behind linear 
logic, because there is a notion of reduction on them (called also “cut elimina- 
tion”) that corresponds to the cut-elimination procedure on sequent derivations. 
The traditional reduction system for MELL is recalled in Appendix 

Extended reduction modulo an equivalence relation Unfortunately, the original
notion of reduction on PN is well adapted to simulate neither the β rule
of the λ-calculus, nor the rules dealing with propagation of substitution in explicit
substitution calculi: too many inessential details on the order of application of
the rules are still present, and to abstract from them, one is naturally
led to define an equivalence relation on PN, as is done in [2], where the following
two equivalences are introduced:

Equivalence A turns contraction into an associative operator, and corresponds
to forgetting the order in which the contraction rule is used to build,
for example, the derivation:



\[ \dfrac{\dfrac{\vdash ?A, ?A, ?A}{\vdash ?A, ?A}\ \text{Contraction}}{\vdash ?A}\ \text{Contraction} \]



Equivalence B abstracts away the relative order of application of the rules of 
box-formation and contraction on the premises of a box, like in the following 
example. 



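\[ \dfrac{\dfrac{\vdash ?A, ?A, B}{\vdash ?A, B}\ \text{Contraction}}{\vdash ?A, {!B}}\ \text{Box} \qquad\qquad \dfrac{\dfrac{\vdash ?A, ?A, B}{\vdash ?A, ?A, {!B}}\ \text{Box}}{\vdash ?A, {!B}}\ \text{Contraction} \]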



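Finally, besides the equivalence relation defined in [2], we will also need an extra
reduction rule allowing to remove unneeded weakening links when simulating
explicit substitutions:

[Figure: the rule wc, which replaces a contraction between a weakening conclusion ?A and a wire ?A by the wire ?A alone.]

This rule allows to simplify the proof below on the left into the proof on the
right:

\[ \dfrac{\dfrac{\dfrac{\pi}{\vdash ?A}}{\vdash ?A, ?A}\ \text{Weakening}}{\vdash ?A}\ \text{Contraction} \qquad\qquad \dfrac{\pi}{\vdash ?A} \]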

Notation In the following we will call R the system made of rules Ax-cut,
⅋-⊗, w-b, d-b, c-b, b-b and wc; we will name E the relation induced
on PN by the contextual closure of axioms A and B; we will write \(R_E\) for the
system made of the rules in R and the equivalences in E; finally, \(R_E^{\neg wc}\) will stand
for system \(R_E\) without rule wc.
Systems \(R_E\) and \(R_E^{\neg wc}\), that contain E, are actually defining a notion of reduction
modulo an equivalence relation, so we write for example \(r \to_{R_E} s\) if and only if
there exist r' and s' such that \(r =_E r' \to_R s' =_E s\), where the equality \(=_E\) is
the reflexive, symmetric and transitive closure of the relation defined by A and B.

The reduction \(R_E\) is flexible enough to allow an elegant simulation of β
reduction and of explicit substitutions, but for that, we first need to establish
that \(R_E\) is strongly normalizing. This property is established in the next section.

3 Termination of \(R_E\)

We know that \(R_E^{\neg wc}\) is terminating (Lemma 1 below), and we can show easily that wc
is terminating too, so if we could show that the wc-rule can be postponed with
respect to all the other rules of \(R_E^{\neg wc}\) we would be easily done using a well-known
abstract lemma. Unfortunately, there is precisely one case in which we cannot
postpone the wc-rule: when a wc reduction creates an axiom-cut redex, which
in turn can only happen if the axiom link in question introduces an exponential
formula. So we are forced to proceed in two steps: first, we prove by postponement
that \(R_E\) is terminating on the set of proof nets without exponential axioms
(Theorem 1). Then, we show that termination of \(R_E\) on all proof nets of PN is
a consequence of termination of \(R_E\) on proof nets without exponential axioms
(Theorem 2). To obtain this last result, we show how to translate a proof net R
with exponential axioms into a proof net R' without exponential axioms in such
a way that a reduction out of R can be simulated by a longer or equal reduction
out of R'.



3.1 Termination of \(R_E\) on Proof Nets without Exponential Axioms

We show in this section that all the \(R_E\)-reduction sequences from a proof net
without exponential axioms terminate. We first recall the following known result:

Lemma 1 (Termination of \(R_E^{\neg wc}\)). The relation \(\to_{R_E^{\neg wc}}\) is terminating on
PN.



Then, we establish the termination of wc. 

Lemma 2 (Termination of wc). The relation — >wc is terminating on PN. 

Proof. The wc-rule strictly decreases the number of nodes in a proof net so no 
infinite wc-reduction sequence is possible. 



Finally, we show that given any proof net without exponential axioms, the
wc-rule can be postponed with respect to any rule of \(R_E^{\neg wc}\).

Lemma 3 (Postponement of wc w.r.t. \(R_E^{\neg wc}\)). Let t be a proof net without
exponential axioms. If \(t \to_{wc} \to_{R_E^{\neg wc}} t'\) then there is a sequence
\(t \to^{+}_{R_E^{\neg wc}} \to^{*}_{wc} t'\).
Proof. By analyzing all the possible cases. See m for details. 

We can now put together the previous results to prove termination of \(R_E\) on
the set of proof nets without exponential axioms.

Lemma 4 (Extraction of \(R_E^{\neg wc}\)). Let S be an infinite sequence of \(R_E\)-reductions
starting at a proof net t without exponential axioms. Then, there is a
sequence of \(R_E\)-reductions from the same proof net t which starts by \(t \to^{+}_{R_E^{\neg wc}} t'\),
where t' is also a proof net without exponential axioms, and which continues
with an infinite sequence S'. We write this sequence as \((t \to^{+}_{R_E^{\neg wc}} t') \cdot S'\).

Now it is easy to establish the fundamental theorem of this section: 

Theorem 1 (Termination of \(R_E\) on proof nets without exponential axioms).
The reduction relation \(R_E\) is terminating on the set of proof nets without
exponential axioms.

Proof. We show it by contradiction. Let us suppose that \(R_E\) is not terminating
on those nets. Then, there exist a proof net t without exponential axioms and
an infinite sequence S of \(R_E\)-reductions starting at t. By applying Lemma 4 to this
sequence S, we obtain a sequence \((t \to^{+}_{R_E^{\neg wc}} t') \cdot S'\) such that S' is infinite again.
If we iterate this procedure an arbitrary number of times, we obtain an arbitrarily
long sequence of \(R_E^{\neg wc}\)-reduction steps. This contradicts the fact that \(R_E^{\neg wc}\) is
terminating.



3.2 Termination of Re on Proof Nets with Exponential Axioms 

We now know that \(R_E\) is terminating on every proof net without exponential
axioms, but we want to show even more: termination of \(R_E\) on all
proof nets. To achieve this result, we show in this section how to associate to
a proof net t, which may contain some exponential axioms, another
proof net E(t) without exponential axioms, such that every reduction from
t of length n can be "simulated" on E(t) by another reduction of length at
least n. This property will be enough to reduce termination of \(R_E\) on proof nets
with exponential axioms to termination of \(R_E\) on proof nets without exponential
axioms.

We define in what follows a notion of complete expansion on axiom links 
that is able to replace all exponential axiom by a correct net with the same 
conclusions, but containing no exponential axiom, and then extend it to a full 
proof net in the natural way (replace each exponential axiom by its complete 
expansion) . 

Definition 1 (Complete expansion of an axiom link). To each axiom link
\(\overline{A\ A^{\perp}}\) we can associate a net \(exp(\overline{A\ A^{\perp}})\) with the same conclusions, defined
by induction on the complexity of the formula A as follows:
- \(exp(\overline{A\ A^{\perp}}) = \overline{A\ A^{\perp}}\), if A is not an exponential formula
- \(exp(\overline{!A\ ?A^{\perp}}) =\) [Figure: net with conclusions !A and ?A⊥ built around \(exp(\overline{A\ A^{\perp}})\).]

which is well defined, because the formula A is smaller than !A.

We can associate a complexity measure rk to a complete expansion. 

Definition 2 (Measure of a complete expansion). We define the measure
rk of a complete expansion of an axiom by cases:

- \(rk(exp(\overline{A\ A^{\perp}})) = 0\), if A is not an exponential formula
- \(rk(exp(\overline{!A\ ?A^{\perp}})) = 1 + rk(exp(\overline{A\ A^{\perp}}))\)
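
Read on formulas, the measure rk simply counts how many exponential connectors are peeled off before a non-exponential formula is reached. The sketch below rests on assumptions that are not in the text: formulas are encoded as nested tuples, and an axiom link is treated symmetrically, so a formula starting with ? is peeled like one starting with !.

```python
def rank(formula):
    """Number of leading exponential connectors of `formula`.

    Hypothetical encoding: ("atom", name), ("!", A), ("?", A),
    ("tensor", A, B), ("par", A, B).
    """
    count = 0
    while formula[0] in ("!", "?"):
        count += 1
        formula = formula[1]
    return count

p = ("atom", "p")
```

A formula such as !?p then has rank 2, while any formula starting with a multiplicative connector or an atom has rank 0.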

We can now define the notion of expanded net E(t) for every net t:

Definition 3 (Expanded net). The expanded net of a net t, written E(t),
is the proof net obtained from t by replacing each occurrence of an exponential
axiom a by exp(a).



[Figure: an exponential axiom replaced by its complete expansion exp(·).]



Remark 1. The only difference between a proof net t and its expanded net E(t)
is on the set of their axioms. So, for every reduction \(t \to_{R_E} t'\) which does not
affect the axioms of t, there is a reduction \(E(t) \to_{R_E} E(t')\).

We now have to show that there is no problem for the axioms either, and to
do so we need the following measure:

Definition 4 (Maximal distance of a cut). Given a proof net t and a cut
link on a completely expanded axiom a in t, the measure d(a, t) is the maximal
distance, in the proof net t, between this cut and the first weakening or dereliction
node encountered on the path which leaves the cut by the extremity opposite to
the expanded axiom a and goes through the nodes from down to up (here up and down
are used formally for the orientation of the nodes presented in the introduction).
More precisely, each node encountered and each box passed on the way counts for 1,
including the final dereliction or weakening node. This measure is always finite
on a finite proof net because there are no arbitrarily long ascending paths.



Example 1. In the following net, the maximal distance of the cut is 4. 
Lemma 5 (Cut elimination on an expanded net). Let t be an expanded
net. A cut in t with a completely expanded axiom exp(a) reduces in t like an
ordinary axiom cut. In other words,

[Figure: a cut against \(exp(\overline{!A\ ?A^{\perp}})\) reduces like a cut against the axiom link (Ax).]

Proof. We prove the property by induction on the lexicographic order
(rk(exp(a)), d(exp(a), t)) where exp(a) is the completely expanded axiom in the
proof net t.

All the cases such that rk(exp(a)) = 0 (including the base case) correspond
to a proof net in which exp(a) is an axiom link, so the same reduction rule
applies and the property then trivially holds. For the cases with rk(exp(a)) > 0,
we refer the interested reader to HD. 

This allows us to establish the final result of this section : 

Theorem 2 (Termination of \(R_E\)). The reduction \(R_E\) is terminating on all
proof nets.

Proof. We establish this result by proving that each reduction step \(t \to_{R_E} t'\)
can be simulated by at least one reduction step \(E(t) \to^{+}_{R_E} E(t')\).

If the reduction step \(t \to_{R_E} t'\) does not reduce any exponential axiom with
a cut, then we obtain the result immediately because the only difference between
t and E(t) is on their axioms. Indeed, we can reproduce the same reduction on
E(t) in order to obtain E(t') and this concludes this case.

Otherwise, if \(t \to_{R_E} t'\) reduces an exponential axiom a with a cut then by
Lemma 5 there exists a non-empty sequence of reductions starting at E(t) which
eliminates the complete expansion of the axiom a, and gives the proof net E(t').

Now, to conclude the proof, suppose that there is a proof net t such that the
reduction \(R_E\) is not terminating on t, that is, there is an infinite \(R_E\)-reduction
sequence starting at t. By the previous remark we can simulate this infinite
reduction sequence by another \(R_E\)-reduction sequence on expanded proof nets
not containing exponential axioms. This leads to a contradiction with Theorem 1,
so that we can conclude that \(R_E\) is terminating on the set of all proof nets.
4 From λl with de Bruijn Indices to PN

We now study the translation from typed terms of the λl-calculus [8] into proof
nets. We start by introducing the calculus, then we give the translation of types
of λl into formulae of linear logic, and the translation of terms of λl into linear
logic proof nets PN. We verify that we can correctly simulate every reduction
step of λl via the notion of reduction \(R_E\). Finally, we use this simulation result
to show strong normalization of the λl-calculus.



4.1 The λl-Calculus

The λl-calculus is a calculus with explicit substitutions where substitutions are
unary (and not multiple). The version studied in this section has variables
encoded with de Bruijn indices. The terms of λl are given by the following grammar:

M ::= n | λM | (M M) | ⟨k⟩M | [i/M, j]M

The term n is called a variable, λM an abstraction, (M M) an application,
⟨k⟩M a labeled term and [i/M, j]M a substitution.

Intuitively, the term ⟨k⟩M means that the k − 1 first indices in M are not
"free" (in the sense of free variables of a calculus with indices). The term [i/N, j]M
means that the i − 1 first indices are not free in N and the j − 1 following indices
are not free in M. Those indices are used to split the typing environment of
[i/N, j]M in three parts: the first (resp. second) one for free variables of M
(resp. N), the third one for the free variables in M and N.

The reduction rules of λl are given in Figure 1 and the typing rules of λl are
given in Figure 2, where we suppose that |Γ| = i and |Δ| = j.



(b1)   (λM N)               →   [0/N, 0]M
(b2)   (⟨k⟩(λM) N)          →   [0/N, k]M
(f)    [i/N, j](λM)         →   λ[i + 1/N, j]M
(a)    [i/N, j](M P)        →   ([i/N, j]M)([i/N, j]P)
(e1)   [i/N, j]⟨k⟩M         →   ⟨j + k − 1⟩M                            if i <
(e2)   [i/N, j]⟨k⟩M         →   ⟨k⟩[i − k/N, j]M                        if i >
(n1)   [i/N, j]k            →   k                                       if i >
(n2)   [i/N, j]i            →   ⟨i⟩N
(n3)   [i/N, j]k            →   j + k − 1                               if i <
(c1)   [i/N, j][k/P, l]M    →   [k/[i − k/N, j]P, j + l − 1]M           if k <
(c2)   [i/N, j][k/P, l]M    →   [k/[i − k/N, j]P, l][i − l + 1/N, j]M   if i >
(d)    ⟨i⟩⟨j⟩M              →   ⟨i + j⟩M

Fig. 1. Reduction rules of λ_l with de Bruijn indices
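As an illustration only, the rules of Fig. 1 that carry no side condition (b1, b2, f, a, n2 and d) can be sketched as a one-step rewriting function at head position. The Python encoding below is an assumption of this sketch and not part of the calculus: the class names Var, Lam, App, Label, Subst and the function head_step are hypothetical, and the rules whose side conditions compare indices are deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Var:            # de Bruijn index n
    n: int


@dataclass(frozen=True)
class Lam:            # abstraction λM
    body: "Term"


@dataclass(frozen=True)
class App:            # application (M N)
    fun: "Term"
    arg: "Term"


@dataclass(frozen=True)
class Label:          # labeled term <k>M
    k: int
    body: "Term"


@dataclass(frozen=True)
class Subst:          # explicit substitution [i/N, j]M
    i: int
    n: "Term"
    j: int
    body: "Term"


Term = Union[Var, Lam, App, Label, Subst]


def head_step(t: Term) -> Optional[Term]:
    """Apply one side-condition-free rule of Fig. 1 at the root, if any."""
    if isinstance(t, App):
        f = t.fun
        if isinstance(f, Lam):                                   # (b1)
            return Subst(0, t.arg, 0, f.body)
        if isinstance(f, Label) and isinstance(f.body, Lam):     # (b2)
            return Subst(0, t.arg, f.k, f.body.body)
    if isinstance(t, Subst):
        m = t.body
        if isinstance(m, Lam):                                   # (f)
            return Lam(Subst(t.i + 1, t.n, t.j, m.body))
        if isinstance(m, App):                                   # (a)
            return App(Subst(t.i, t.n, t.j, m.fun),
                       Subst(t.i, t.n, t.j, m.arg))
        if isinstance(m, Var) and m.n == t.i:                    # (n2)
            return Label(t.i, t.n)
    if isinstance(t, Label) and isinstance(t.body, Label):       # (d)
        return Label(t.k + t.body.k, t.body.body)
    return None  # no rule of this fragment applies at the root
```

For instance, head_step(App(Lam(Var(0)), Var(3))) builds the substitution [0/3, 0]0 by rule b1, and a second call turns it into the labeled term ⟨0⟩3 by rule n2.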
(Axiom)    Γ, A, Δ ⊢ i : A

           Δ, Π ⊢ N : A    Γ, A, Π ⊢ M : B
(Subst)    --------------------------------
           Γ, Δ, Π ⊢ [i/N, j]M : B

           Δ ⊢ M : B
(Weak)     ----------------
           Γ, Δ ⊢ ⟨i⟩M : B

           Γ ⊢ M : B → A    Γ ⊢ N : B
(App)      ---------------------------
           Γ ⊢ (M N) : A

           B, Γ ⊢ M : C
(Lambda)   ----------------
           Γ ⊢ λM : B → C

Fig. 2. Typing rules for λ_l with de Bruijn indices

We notice that for each well-typed term of the λ_l-calculus, there is only one
possible typing judgment. This will simplify the proof of simulation of λ_l, since
we can simply consider the unique typing judgment of each term.

As expected the λ_l-calculus enjoys the subject reduction property m-

Theorem 3 (Subject Reduction). If Γ ⊢ M : C and M → M', then
Γ ⊢ M' : C.

4.2 Translation of Types and Terms of λ_l

We use the translation of types introduced in given by : 

A* = A   if A is an atomic type

(A → B)* = ?((A*)⊥) ⅋ !B*   (that is, !A* ⊸ !B*) otherwise
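The translation of types is a two-clause recursion, which the following sketch spells out. The datatypes (Atom, Arrow for intuitionistic types; LAtom, Par, Tensor, OfCourse, WhyNot for linear formulae) and the helper names dual and translate are assumptions made for this illustration; only the two defining equations come from the text.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Atom:              # atomic intuitionistic type
    name: str


@dataclass(frozen=True)
class Arrow:             # intuitionistic type A -> B
    dom: "Type"
    cod: "Type"


Type = Union[Atom, Arrow]


@dataclass(frozen=True)
class LAtom:             # linear atom, or its negation when positive is False
    name: str
    positive: bool = True


@dataclass(frozen=True)
class Tensor:
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Par:
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class OfCourse:          # !F
    body: "Formula"


@dataclass(frozen=True)
class WhyNot:            # ?F
    body: "Formula"


Formula = Union[LAtom, Tensor, Par, OfCourse, WhyNot]


def dual(f: Formula) -> Formula:
    """Linear negation, pushed down to the atoms by the De Morgan laws."""
    if isinstance(f, LAtom):
        return LAtom(f.name, not f.positive)
    if isinstance(f, Tensor):
        return Par(dual(f.left), dual(f.right))
    if isinstance(f, Par):
        return Tensor(dual(f.left), dual(f.right))
    if isinstance(f, OfCourse):
        return WhyNot(dual(f.body))
    return OfCourse(dual(f.body))       # remaining case: WhyNot


def translate(t: Type) -> Formula:
    """A* = A for atoms, (A -> B)* = ?((A*)^perp) par !B* otherwise."""
    if isinstance(t, Atom):
        return LAtom(t.name)
    return Par(WhyNot(dual(translate(t.dom))), OfCourse(translate(t.cod)))
```

Reading !A* ⊸ !B* as (!A*)⊥ ⅋ !B*, the value of translate(Arrow(Atom("a"), Atom("b"))) coincides with Par(dual(OfCourse(LAtom("a"))), OfCourse(LAtom("b"))), which is the parenthetical remark in the definition.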

Since wires are commutative in proof nets, we feel free to exchange them
when we define the translation of a term. The translation associates to every
typed term M of λ_l, whose type judgment ends with the conclusion written
below on the left, a proof net having the shape sketched below on the right:

Γ ⊢ M : A        [proof net diagram: a net labeled M whose conclusions include A*]

Here is the formal definition of the translation T from λ_l-terms into proof nets.

— If the term is a variable and its type judgment ends with the rule written
below on the left, then its translation is the proof net on the right

Γ, A, Δ ⊢ i : A   (Axiom)

where i is the position of A in the typing environment.

— If the term is a λ-abstraction and its type judgment ends with the rule
written below on the left, then its translation is the proof net on the right

— If the term is an application and its type judgment ends with the rule written
below on the left, then its translation is the proof net on the right

— If the term is a substitution and its type judgment ends with the rule written
below on the left, then its translation is the proof net on the right

Δ, Π ⊢ N : A    Γ, A, Π ⊢ M : B
--------------------------------   (Subst)
Γ, Δ, Π ⊢ [i/N, j]M : B

where i is the length of the list Γ and j is the length of the list Δ.

— Finally, if the term is a label and its type judgment ends with the rule written
below on the left, then its translation is the proof net on the right

Δ ⊢ M : B
----------------   (Weak)
Γ, Δ ⊢ ⟨i⟩M : B

where i is the length of the list Γ.
4.3 Simulating λ_l-Reduction

We now verify that our notion of reduction R_E on PN simulates the λ_l-reduction
on typed λ_l-terms. It is in this proof that we find the motivation for our choice of
translation from λ-terms into proof nets: with the more traditional translation
sending the intuitionistic type A → B into the linear !A ⊸ B, the simulation of
the rewrite rule f would give rise to an equality, not to a reduction step as in
this paper.

Lemma 6 (Simulation of λ_l). The relation R_E simulates the λ_l-reduction on
typed terms: if t →_{λ_l} t', then T(t) →_{R_E} T(t'), except for the rules e2 and
d for which we have T(t) = T(t').

Proof. The proof proceeds by cases on the reduction rule applied in the step
t →_{λ_l} t'. Since the reductions λ_l and R_E are closed under all contexts, we only
need to study the cases where reduction takes place at the head position of t. In
the proof, rule wc is used to simulate b2, e1, n1, n2, n3, equivalence A is used to
simulate a, c1, c2, and equivalence B is used to simulate f, a, c1, c2.

Due to space limitations, we cannot give here the full proof, which is fully
developed in but we show anyway the case of rule c1, one of the composition
rules:



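[i/N, j][k/P, l]M → [k/[i − k/N, j]P, j + l − 1]M   if k < i < k + l

Here, the typing judgment of [i/N, j][k/P, l]M must end with

                    Γ', B, Π, Π' ⊢ P : C    Γ, C, Π' ⊢ M : A
                    ----------------------------------------- Subst
Δ, Π, Π' ⊢ N : B    Γ, Γ', B, Π, Π' ⊢ [k/P, l]M : A
------------------------------------------------------------- Subst
Γ, Γ', Δ, Π, Π' ⊢ [i/N, j][k/P, l]M : A

while the typing judgment of [k/[i − k/N, j]P, j + l − 1]M must end with

Δ, Π, Π' ⊢ N : B    Γ', B, Π, Π' ⊢ P : C
----------------------------------------- Subst
Γ', Δ, Π, Π' ⊢ [i − k/N, j]P : C             Γ, C, Π' ⊢ M : A
--------------------------------------------------------------- Subst
Γ, Γ', Δ, Π, Π' ⊢ [k/[i − k/N, j]P, j + l − 1]M : A
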
So, the translation of the type derivation of the first term is 
while the translation of the second derivation is




To reduce the first proof net into the second one, we must eliminate the b — b 
cut, then apply the equivalence relations A and B. 

We are now able to show strong normalization of λ_l. To achieve this result, we
use the following abstract theorem (see for example m):

Theorem 4. Let R = (O, R1 ∪ R2) be an abstract reduction system such that
R2 is strongly normalizing and there exists a reduction system S = (O', R'), with
a translation T of O into O' such that a →_{R1} b implies T(a) →⁺_{R'} T(b), and
a →_{R2} b implies T(a) = T(b). Then if R' is strongly normalizing, R1 ∪ R2 is
also strongly normalizing.
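The argument behind this kind of statement is standard. A short sketch, written in our own words rather than taken from the cited proof, and assuming that each R1-step is mapped to at least one R'-step, runs as follows:

```latex
\begin{align*}
&\text{Suppose } a_0 \to_{R_1 \cup R_2} a_1 \to_{R_1 \cup R_2} a_2 \to_{R_1 \cup R_2} \cdots
  \text{ is an infinite reduction sequence.}\\
&\text{Since } R_2 \text{ is strongly normalizing, infinitely many of its steps are } R_1\text{-steps.}\\
&a_i \to_{R_2} a_{i+1} \;\Longrightarrow\; T(a_i) = T(a_{i+1}),
  \qquad
  a_i \to_{R_1} a_{i+1} \;\Longrightarrow\; T(a_i) \to^{+}_{R'} T(a_{i+1}).\\
&\text{Hence } T(a_0) \to^{+}_{R'} \cdots \text{ is an infinite } R'\text{-reduction sequence,
  contradicting the strong normalization of } R'.
\end{align*}
```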

If we take O as the set of typed λ_l-terms, R1 as λ_l − {e2, d}, R2 as {e2, d}, O' as
the set of proof nets and R' as the reduction R_E, then, by the Theorem H] and
the fact that the system including the rules {e2, d} is strongly normalizing |H|,
we can conclude:

Theorem 5 (Strong normalization of λ_l). The typed λ_l-calculus is strongly
normalizing.

5 The λ_l-Calculus with Names

In this section we present a version of typed λ_l with named variables. We first
introduce the grammar of terms, then the typing and reduction rules, and finally,
we will briefly discuss the translation of this syntax to PN.

The terms of this calculus are given by the following grammar:

M ::= x | λx.M | (M M) | ΔM | M[x, M, Γ, Δ]

The term x is called a variable, λx.M an abstraction, (M M) an application, ΔM
a labeled term and M[x, M, Γ, Δ] a substitution.

Intuitively, the term ΔM means that the variables in Δ are not in M, and
the term M[x, N, Γ, Δ] means that the variables in Γ do not appear in N (they
only belong to the type environment of M) and the variables in Δ do not appear
in M (they only belong to the type environment of N).

Variables are bound by the abstraction and substitution operators, so that
for example x is bound in λx.x and in x[x, N, Γ, Δ].

Terms are identified modulo α-conversion so that bound variables can be
systematically renamed. Indeed, we have λy.y[x, z, ∅, ∅] =_α λy'.y'[x, z, ∅, ∅] and
λy.y[x, z, ∅, ∅] =_α λy.y[x', z, ∅, ∅] and λl.y[x, z, {l}, ∅] =_α λl'.y[x, z, {l'}, ∅]. We
remark that the conditions on indices used in the typing rules given in Section 4.1
are now conditions on sets of variables. The typing rules are given in Figure 3.



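(Axiom)    Γ, x : A ⊢ x : A

           Γ ⊢ M : B → A    Γ ⊢ N : B
(App)      ---------------------------
           Γ ⊢ (M N) : A

           Γ ⊢ M : A    Γ ∩ Δ = ∅
(Weak)     -----------------------
           Γ, Δ ⊢ ΔM : A

           Γ, x : A ⊢ M : B
(Lambda)   ------------------------
           Γ ⊢ λx : A.M : A → B

           Δ, Π ⊢ N : A    Γ, x : A, Π ⊢ M : B    (Γ, x : A) ∩ Δ = ∅
(Subst)    ----------------------------------------------------------
           Δ, Γ, Π ⊢ M[x, N, Γ, Δ] : B

Fig. 3. Typing rules for the λ_l-calculus with named variables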



We remark that whenever Γ ⊢ M[x, N, Δ, Π] is derivable, then Γ necessarily
contains Δ and Π.

As expected the λ_l-calculus with names enjoys the subject reduction property
(see [11] for a detailed proof).

Theorem 6 (Subject Reduction). If Γ ⊢ M : C and M → M', then
Γ ⊢ M' : C.

We define the reduction rules only on typed terms, since we are focusing here
on a named version of the typed λ_l-calculus with indices. These rules already give
the flavor of what a general notion of reduction for non-typed terms with names
should be, but a precise formalization of the untyped case is left for further work.

The reduction rules of the typed λ_l-calculus with names are given in Figure 4
(notice that rule b1 is a particular case of rule b2 with Δ = ∅).

As customary in explicit substitution calculi with names [3], we work
modulo α-conversion, so that we can suppose that in the rule Weak the set Δ does
not contain variables that are bound in M. Also, this allows us to restrict rule
f, without loss of generality, to the case where no variable capture arises.

In order to translate a term of λ_l into a proof net, we use exactly the same
translation of types that we used in Section 4.2 and we then define the translation
of a term M using the type derivation of M.
(b1)   (λx : A.M) N                →   M[x, N, ∅, ∅]
(b2)   (Δ(λx : A.M)) N             →   M[x, N, ∅, Δ]
(f)    (λy : A.M)[x, N, Γ, Δ]      →   λy : A.M[x, N, Γ + y, Δ]                   if y ∉ FV(N)
(a)    (M P)[x, N, Γ, Δ]           →   (M[x, N, Γ, Δ] P[x, N, Γ, Δ])
(e1)   ΛM[x, N, Γ, Δ]              →   (Δ ∪ (Λ \ x))M                             if x ∈ Λ
(e2)   ΛM[x, N, Γ, Δ]              →   (Γ ∩ Λ)M[x, N, Γ \ Λ, Δ ∪ (Λ \ Γ)]         if x ∉ Λ
(n1)   y[x, N, Γ, Δ]               →   y                                          if y ≠ x
(n2)   x[x, N, Γ, Δ]               →   ΓN
(c1)   M[y, P, Λ, Φ][x, N, Γ, Δ]   →   M[y, P[x, N, Γ \ Λ, Δ], Λ, Δ ∪ (Φ \ x)]    if x ∈ Φ \ Λ
(c2)   M[y, P, Λ, Φ][x, N, Γ, Δ]   →   M[x, N, (Γ \ Φ) + y, Δ][y, P[x, N, Γ \ Λ, Δ], Λ, Γ ∩ Φ]   if x … Λ
(d)    ΓΔM                         →   (Γ ∪ Δ)M

Fig. 4. Reduction rules of the λ_l-calculus with named variables



Since in proof nets there is no trace left of the order which is implicit in the
formalism of de Bruijn indices, it comes as no surprise that the translation of λ_l
with names into the nets is really the same as the one for λ_l (see m for full
details).

The simulation of the reduction rules of the λ_l-calculus with names by the
reduction R_E is identical to that given in Section 4.2 for the λ_l-calculus with
indices. We just remark that rule n3 makes no sense in the formalism with names
so that the proof has one less case. We just state the result without repeating a
boring verification:

Lemma 7 (Simulation of λ_l with names). If t λ_l-reduces to t' in the formalism
with names, then T(t) →_{R_E} T(t'), except for the rules e2 and d for
which we have T(t) = T(t').

We can then conclude the following:

Theorem 7 (Strong Normalization of λ_l with names). The typed λ_l-calculus
with names is strongly normalizing.

6 Conclusion and Future Works 

In this paper we enriched the standard notion of cut elimination in proof nets in
order to obtain a system R_E which is flexible enough to provide an interpretation
of λ-calculi with explicit substitutions and which is much simpler than the one
proposed in PD]. We have proved that this system is strongly normalizing.

We have then proposed a natural translation from λ_l into proof nets that
immediately provides strong normalization of the typed version of λ_l, a calculus
featuring full composition of substitutions. The proof is extremely simple w.r.t.
the proof of PSN of λ_l given in [8] and shows in some sense that λ_l, which
was designed independently of proof nets, is really tightly related to reduction
in proof nets.

Finally, the fact that the relative order of variables is lost in the proof-net
representation of a term led us to discover a version of typed λ_l with named
variables, instead of de Bruijn indices. This typed named version of λ_l gives a
better understanding of the mechanisms of the calculus. In particular, names
allow us to understand the manipulation of explicit weakenings in λ_l without entering
into the details of renaming of de Bruijn indices. However, the definition of a
general notion of reduction for non-typed terms with names remains as further work.



This work suggests several interesting directions for future investigation: on
the linear logic side, one should wonder whether R_E is the definitive system
able to interpret β-reduction, or whether we need some more equivalences to
be added. Indeed, there are still a few cases in which the details of a sequent
calculus derivation are inessential, even if we did not need to consider them for
the purpose of our work, like for example

⊢ Γ, B                              ⊢ Γ, B
------------- Weakening             ------------- Box
⊢ ?A, Γ, B                          ⊢ Γ, !B
------------- Box                   ------------- Weakening
⊢ ?A, Γ, !B                         ⊢ ?A, Γ, !B

On the explicit substitutions side, we look forward to the discovery of a
calculus with multiple substitutions with the same properties as λ_l, in the spirit
of A^.
A Reduction of Proof Nets

Reduction acting on a cut Ax−cut, removing an axiom: [diagram not reproduced]

Reduction acting on a cut ⊗−⅋: [diagram not reproduced]

Reduction acting on a cut w−b, erasing a box: [diagram not reproduced]

Reduction acting on a cut d−b, opening a box: [diagram not reproduced]

Reduction acting on a cut c−b, duplicating a box: [diagram not reproduced]

Reduction acting on a cut b−b, absorbing a box into another: [diagram not reproduced]
Abstract. We introduce a variant of the system of rank 2 intersection
types with new typing rules for local definitions (let-expressions
and letrec-expressions) and conditional expressions (if-expressions and
case-expressions). These extensions are a further step towards the use of
intersection types in “real” programming languages.



1 Introduction 



The Hindley-Milner type system is the core of the type systems of modern
functional programming languages, like ML El, Miranda, Haskell, and
Clean. The fact that this type system is somewhat inflexible¹ has motivated
the search for more expressive, but still decidable, type systems (see, for
instance, |1 n|l 4l4|2|S|7l9j L The extensions based on intersection types are
particularly interesting since they generally have the principal typing property²,
whose advantages w.r.t. the principal type property³ of the ML type system have been
described in [7]. In particular the system of rank 2 intersection types |1 on 411, 517)
is able to type all ML programs, has the principal typing property, decidable
type inference, and complexity of type inference which is of the same order as
in ML. The variant of the system of rank 2 intersection types considered by
Jim [7] is particularly interesting since it includes a new rule for typing
recursive definitions which makes it possible to type some, but not all, examples
of polymorphic recursion m-

In this paper we build on Jim’s work [7] and present a new system of rank
2 intersection types, which allows more expressive typings to be given to
locally defined identifiers (let-bound and letrec-bound identifiers) and to
conditional expressions (we consider only if-expressions, but the technique can be



¹ In particular it does not allow different types to be assigned to different occurrences of a
formal parameter in the body of a function.

² A type system has the principal typing property if, whenever a term e is typable,
there exist a type environment A and a type v representing all possible typings of e.

³ A type system has the principal type property if, whenever a term e is typable in a
type environment A, there exists a type v representing all possible types of e in A.



J. Tiuryn (Ed.): FOSSACS 2000, LNCS 1784, pp. 82-[S71 2000. 
(c) Springer- Verlag Berlin Heidelberg 2000 
straightforwardly applied to case-expressions). These extensions are a further
step towards the use of intersection types in “real” programming languages.



Better Typings for Local Definitions. The system of simple types [5] assigns
the same type to all the uses of an identifier. To overcome this limitation the
Hindley-Milner type system considers a special rule to type local definitions;
in this way locally defined identifiers (let-bound identifiers) are handled in a
different way from the formal parameters of the functions (λ-bound identifiers).

In practice let-polymorphism is also used to allow polymorphic use of globally
defined identifiers. The key idea is that of handling an expression e which uses
globally defined identifiers x_1, ..., x_n like the expression let x_1 = e_1 in ... let x_n =
e_n in e, in which the definitions of x_1, ..., x_n are local (and therefore available).
This use of let-polymorphism to deal with global definitions has often been
described as an odd feature, since it does not allow global definitions to be typechecked
in isolation. The problem can be identified with the fact that algorithm W
requires as necessary inputs the type assumptions for the free identifiers of the
expression being typed. Some solutions to overcome this limitation have been
proposed in the literature (see for instance [lllLlj l.

Systems with rank 2 intersection types can provide an elegant solution to this
problem by relying on the principal typing property, see [2], and handling
let-expressions “let x = e_0 in e” as syntactic sugar for “(λx.e)e_0”. In this way both
locally defined and globally defined identifiers are handled as function formal
parameters. However this strategy has a drawback: it forces simple
types to be assigned to the uses of locally defined identifiers. For instance, the expression

let g = λf.pair (f 2) (f true) in g (λy.cons y nil)    (1)

cannot be typed, since to type (1) it is necessary to assign the rank 2 type
((int → int list) ∧ (bool → bool list)) → (int list × bool list) to the locally defined
identifier g.
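To see why expression (1) is worth typing, it helps to evaluate it in an untyped setting. The sketch below transcribes it into Python under our own encoding (curried functions pair and cons, a Python list for nil, a tuple for the pair); none of these names belong to the type system, they only mirror the constants of the example.

```python
def pair(a):
    """Curried pair constructor: pair a b = (a, b)."""
    return lambda b: (a, b)


def cons(x):
    """Curried list constructor: cons x xs = x :: xs."""
    return lambda xs: [x] + xs


nil = []

# let g = λf. pair (f 2) (f true) in g (λy. cons y nil)
g = lambda f: pair(f(2))(f(True))
result = g(lambda y: cons(y)(nil))

print(result)  # ([2], [True])
```

The argument λy.cons y nil is applied once to an integer and once to a boolean inside g, which is exactly the usage that the type ((int → int list) ∧ (bool → bool list)) → (int list × bool list) records for g.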

In this paper we present a technique that, while preserving the benefits of
the principal typing property of the system of rank 2 intersection types, allows
rank 2 intersection types to be assigned to the uses of locally defined identifiers, by
exploiting the fact that their definition is indeed available. As we will see, typing
let-expressions let x = e_0 in e by associating to the identifier x the principal
type scheme of e_0 (which is a formula ∀α⃗.v_0, where v_0 is a rank 2 type and
α⃗ are some of the type variables of v_0) is not a good solution, since, when e_0
contains free identifiers, it may happen that replacing a subexpression (λx.e)e_0
with let x = e_0 in e does not preserve typability. To avoid this problem we will
associate to x the principal pair scheme of e_0 (which is a formula ∀α⃗.(A_0, v_0),
where A_0 is a type environment, v_0 is a rank 2 type, and α⃗ are all the type
variables of A_0 and v_0).



Better Typings for Conditional Expressions. The ML type system handles
an if-expression “if e then e_1 else e_2” like the application “ifc e e_1 e_2”, where ifc is
a special constant of principal type scheme ∀α.bool → α → α → α. If we apply
this strategy to a system with rank 2 intersection types we are forced to assign
simple types to the conditional expression and to its branches, e_1 and e_2, and so
the additional type information provided by intersection is lost.

In this paper we present a technique that overcomes this limitation
and assigns rank 2 intersection types to conditional expressions. For simplicity
we consider only if-expressions, but the technique can be straightforwardly
applied to case-expressions and functions defined by cases. As we will see,
allowing an if-expression if e then e_1 else e_2 to be assigned any rank 2 type v that
can be assigned to both e_1 and e_2 will destroy the principal typing (and type)
property of the rank 2 intersection type system. So, to preserve the principal
typing property, we will introduce a condition that limits the use of intersection
in the type v assigned to the branches e_1 and e_2 of the if-expression.

Organization of the Paper. In Section 2 of this paper we describe a simple
programming language, that we call mini-ML, which can be considered the
kernel of functional programming languages like ML, Miranda, Haskell, and Clean
(the evaluation mechanism, call-by-name or call-by-value, is not relevant for the
purpose of typechecking). Section 3 introduces the syntax of our rank 2
intersection types, together with other basic definitions. Section 4 presents the ⊢^Rec
type system, which is essentially the extension to mini-ML of the type system
⊢_P2R of [7]. In Sections 5 and 6 we describe two new type systems for mini-ML:
⊢^{Let,Rec}, which extends the ⊢^Rec system with more powerful rules for typing local
definitions (let-expressions and letrec-expressions), and ⊢^{If,Let,Rec}, which extends
the ⊢^{Let,Rec} system with a more powerful rule for typing if-expressions.

2 The Language Mini-ML 

We consider two classes of constants: constructors for denoting base values
(integers, booleans) and building data structures, and base functions for denoting
operations on base values and for inspecting and decomposing data structures.
The base functions include some arithmetic operators, and the functions for
decomposing pairs (prj_1 and prj_2) and for decomposing and inspecting lists (hd, tl,
and null). The constructors include the unique element of type unit, the booleans,
the integer numbers, and the constructors for pairs and lists. Let bf range over
base functions (all unary) and cs range over constructors. The syntax of
constants (ranged over by c) is as follows

c ::= bf | cs

bf ::= not | and | or | + | − | * | / | = | ... | prj_1 | prj_2 | hd | tl | null

cs ::= () | true | false | ... | −1 | 0 | 1 | ... | nil | pair | cons

Expressions (ranged over by e) have the following syntax

e ::= x | c | λx.e | e_1 e_2 | if e then e_1 else e_2
    | let x = e_1 in e_2 | letrec {x_1 = e_1, ..., x_n = e_n} in e
where x, x_1, ..., x_n range over identifiers. The construct letrec allows mutually
recursive expression definitions. Let FV(e) denote the set of free identifiers of the 
expression e. Expressions are considered syntactically equal modulo renaming 
of the bound identifiers. In order to simplify the presentation we assume that, 
in every expression, different bound identifiers have different names and that 
the names of bound identifiers cannot occur free in the expression (this can be 
always enforced by suitable renaming of bound identifiers) . 

3 Types, Schemes, and Environments 

In this section we introduce the syntax of our rank 2 intersection types, together 
with other basic definitions that will be used in the rest of the paper. 

Types and Schemes. The language of simple types (T_0), ranged over by u,
is defined by the grammar: u ::= α | unit | bool | int | u → u | u × u | u list.
We have type variables (ranged over by α) and a selection of ground types and
composite types. The ground types are unit (the singleton type), bool (the set
of booleans), and int (the set of integers). The composite types are product and
list types.

The language of rank 1 intersection types (T_1), ranged over by ui, the
language of rank 2 intersection types (T_2), ranged over by v, and the language of
rank 2 intersection schemes (T_∀2), ranged over by vs, are defined as follows

ui ::= u_1 ∧ ... ∧ u_n     (rank 1 types, i.e. intersections of simple types)

v ::= u | ui → v            (rank 2 types)

vs ::= ∀α⃗.v                (rank 2 schemes)

where u ranges over the set of simple types T_0, n ≥ 1, and α⃗ is a finite (possibly
empty) sequence of type variables α_1 ... α_m (m ≥ 0). Note that T_0 = T_1 ∩ T_2.
Let ε denote the empty sequence. We consider ∀ε.v ≠ v, so T_2 ∩ T_∀2 = ∅.

Free and bound type variables are defined as usual. For every type t ∈ T_1 ∪
T_2 ∪ T_∀2 let FTV(t) denote the set of free type variables of t. We say that a
scheme vs is closed if FTV(vs) = ∅.

To simplify the presentation we adopt the following syntactic convention:
we consider ∧ to be associative, commutative, and idempotent. Modulo this
convention any type in T_1 can be considered as a set of types in T_0. We also
assume that for every scheme ∀α⃗.v we have that {α⃗} ⊆ FTV(v).

A substitution s is a mapping from type variables to simple types which is
the identity on all but a finite number of type variables. The domain, Dom(s),
of a substitution s is the set of type variables {α | s(α) ≠ α}. We use s_{α⃗}
to range over substitutions whose domain is a subset of {α⃗}. Note that, since
substitutions replace free variables by simple types, we have that T_0, T_1, T_2,
and T_∀2 are closed under substitution.

The following definitions are fairly standard. Note that we keep a clear
distinction between subtyping and instantiation relations, and we do not introduce
a subtyping relation between rank 2 schemes.

Definition 1 (Subtyping relations ≤_1 and ≤_2). The subtyping relations ≤_1
(⊆ T_1 × T_1) and ≤_2 (⊆ T_2 × T_2) are inductively defined as follows

— u ≤_2 u, if u ∈ T_0

— u_1 ∧ ... ∧ u_n ≤_1 u'_1 ∧ ... ∧ u'_m, if {u_1, ..., u_n} ⊇ {u'_1, ..., u'_m}

— ui → v ≤_2 ui' → v', if ui' ≤_1 ui and v ≤_2 v'.
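Definition 1 translates almost literally into code once rank 1 types are read as sets of simple types. The sketch below is an illustration under simplifying assumptions of ours: type variables and ground types are plain strings, every arrow carries a frozenset as its domain (so a simple arrow u → u' is an arrow with a singleton domain), and product and list types are omitted. The names Arrow, inter, sub1 and sub2 are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Union


@dataclass(frozen=True)
class Arrow:
    dom: FrozenSet["Ty"]    # rank 1 type: a non-empty set of simple types
    cod: "Ty"               # rank 2 type (possibly simple)


Ty = Union[str, Arrow]      # a string stands for a type variable or a ground type


def inter(*us: Ty) -> FrozenSet[Ty]:
    """Build the intersection u_1 ∧ ... ∧ u_n as a set (∧ is ACI)."""
    return frozenset(us)


def sub1(ui: FrozenSet[Ty], ui2: FrozenSet[Ty]) -> bool:
    """ui <=_1 ui2 holds when the components of ui include those of ui2."""
    return ui >= ui2


def sub2(v: Ty, v2: Ty) -> bool:
    """v <=_2 v2: contravariant in the domain, covariant in the codomain."""
    if isinstance(v, Arrow) and isinstance(v2, Arrow):
        return sub1(v2.dom, v.dom) and sub2(v.cod, v2.cod)
    return v == v2          # remaining simple types are related only to themselves


identity = Arrow(inter("a1"), "a1")              # α1 → α1
wider = Arrow(inter("a1", "a2"), "a1")           # (α1 ∧ α2) → α1
print(sub2(identity, wider), sub2(wider, identity))
```

With this encoding α1 → α1 ≤_2 (α1 ∧ α2) → α1 holds while the converse does not, since the larger intersection in the domain is the more demanding assumption on the argument.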



Definition 2 (Instantiation relations ≤_∀2,0, ≤_∀2,1 and ≤_∀2,2). The
instantiation relations ≤_∀2,0 (⊆ T_∀2 × T_0), ≤_∀2,1 (⊆ T_∀2 × T_1), and ≤_∀2,2 (⊆ T_∀2 × T_2)
are defined as follows. For every scheme ∀α⃗.v ∈ T_∀2 and for every type

0. u ∈ T_0, we write ∀α⃗.v ≤_∀2,0 u if u = s_{α⃗}(v), for some substitution s_{α⃗};

1. u_1 ∧ ... ∧ u_n ∈ T_1, we write ∀α⃗.v ≤_∀2,1 u_1 ∧ ... ∧ u_n if ∀α⃗.v ≤_∀2,0 u_i, for
every i ∈ {1, ..., n};

2. v' ∈ T_2, we say that v' is an instance of ∀α⃗.v, and write ∀α⃗.v ≤_∀2,2 v', if
s_{α⃗}(v) ≤_2 v', for some substitution s_{α⃗}.

For example, for vs = ∀α_1α_2α_3.((α_1 → α_3) ∧ (α_2 → α_3)) → α_3, we have
(remember that ∧ is idempotent) vs ≤_∀2,0 (int → int) → int (by using the
substitution s_1 = [α_1, α_2, α_3 := int]) and vs ≤_∀2,1 ((int → int) → int) ∧ ((bool →
bool) → bool) (by s_1 as above, and s_2 = [α_1, α_2, α_3 := bool]). We also have
∀α.α → α ≤_∀2,2 (α_1 ∧ α_2) → α_1 (by s = [α := α_1] and ≤_2).



Type Environments. A type environment T is a set {x_1 : t_1, ..., x_n : t_n} of
type assumptions for identifiers such that every identifier x can occur at most
once in T. We write Dom(T) for {x_1, ..., x_n} and T, x : t for the environment
T ∪ {x : t} where it is assumed that x ∉ Dom(T). In particular:

— a rank 1 type environment A is an environment {x_1 : ui_1, ..., x_n : ui_n} of
rank 1 type assumptions for identifiers, and

— a rank 2 scheme environment B is an environment {x_1 : vs_1, ..., x_n : vs_n}
of closed rank 2 scheme assumptions for identifiers.

For every type v ∈ T_2 and type environment T we write Gen(T, v) for the
∀-closure of v in T, i.e. for the scheme ∀α⃗.v where {α⃗} = FTV(v) − FTV(T).

Given two rank 1 environments A_1 and A_2 we write A_1 + A_2 to denote the
rank 1 environment

{x : ui_1 ∧ ui_2 | x : ui_1 ∈ A_1 and x : ui_2 ∈ A_2}

∪ {x : ui_1 ∈ A_1 | x ∉ Dom(A_2)} ∪ {x : ui_2 ∈ A_2 | x ∉ Dom(A_1)},

and write A_1 ≤_1 A_2 to mean that Dom(A_1) = Dom(A_2) and for every
assumption x : ui_2 ∈ A_2 there is an assumption x : ui_1 ∈ A_1 such that ui_1 ≤_1 ui_2.
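The two operations on rank 1 environments can be sketched with dictionaries. The encoding is an assumption of this example: an environment is a dict from identifier names to frozensets, each frozenset standing for an intersection of simple types written as opaque strings, and env_sum and env_le are hypothetical names for A_1 + A_2 and A_1 ≤_1 A_2.

```python
from typing import Dict, FrozenSet

Rank1 = FrozenSet[str]          # an intersection of simple types, as a set
Env = Dict[str, Rank1]          # a rank 1 type environment


def env_sum(a1: Env, a2: Env) -> Env:
    """A1 + A2: intersect the assumptions on shared identifiers, keep the rest."""
    out = dict(a1)
    for x, ui in a2.items():
        out[x] = (out[x] | ui) if x in out else ui
    return out


def env_le(a1: Env, a2: Env) -> bool:
    """A1 <=_1 A2: same domain, and pointwise ui1 <=_1 ui2 (ui1 is a superset of ui2)."""
    return a1.keys() == a2.keys() and all(a1[x] >= a2[x] for x in a2)


a1 = {"f": frozenset({"int -> int"})}
a2 = {"f": frozenset({"bool -> bool"}), "y": frozenset({"int"})}
total = env_sum(a1, a2)
print(sorted(total["f"]), sorted(total["y"]))
```

Here f is used at two different simple types in the two environments, so the sum records the intersection of both, while y is only mentioned in A_2 and is copied unchanged.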
4 System ⊢^Rec: Jim’s ⊢_P2R Type System Revised

In this section we introduce the type system ⊢^Rec for mini-ML. System ⊢^Rec
is essentially an extension to mini-ML of the type system ⊢_P2R of [7] (the
language considered in [7] is a λ-calculus without constants enriched with
letrec-expressions). Then, in Sections 5 and 6 we will extend ⊢^Rec with new typing
rules for local definitions and conditional expressions.

The Type Inference Rules. The type inference system ⊢^Rec has judgements of
the form A; B ⊢^Rec e : v, where B is a rank 2 environment specifying closed T_∀2
types for library identifiers⁴, and A is a rank 1 environment containing the type
assumptions for the remaining free identifiers of e. So FV(e) ⊆ Dom(A ∪ B),
Dom(A) ∩ Dom(B) = ∅, and Dom(A) = FV(e) − Dom(B)⁵. Note that (by
definition of rank 2 scheme environment) FTV(B) = ∅.

We say that e is typable in ⊢^Rec w.r.t. the library environment B if there
exists a typing A; B ⊢^Rec e : v, for some A and v.

The type inference rules are presented in Fig. 2. The rule for typing constants
uses the function Typeof (tabulated in Fig. 1) which assigns a closed scheme to
each constant. Since, by definition, Dom(A) contains exactly the assumptions
for the free non-library identifiers of the expression e being typed, we have two
rules for typing an abstraction λx.e, corresponding to the two cases x ∈ FV(e)
and x ∉ FV(e). The rule for typing function application, (App), allows a
different typing to be used for each expected type of the argument. The rule for typing
if-expressions handles an expression if e then e_1 else e_2 like the application ifc e e_1 e_2,
where ifc is a special constant of type ∀α.bool → α → α → α. A let-expression,
let x = e_0 in e, is considered as syntactic sugar for the application (λx.e)e_0.

The rule for typing letrec-expressions, letrec {x_1 = e_1, ..., x_n = e_n} in e,
introduces auxiliary expressions of the form rec_i {x_1 = e_1, ..., x_n = e_n}, for
1 ≤ i ≤ n. These auxiliary expressions are introduced just for convenience in
presenting the type system (rec_i {x_1 = e_1, ..., x_n = e_n} is simply short for
letrec {x_1 = e_1, ..., x_n = e_n} in x_i).

The only non-structural rule is (Sub), which allows less specific
types to be assumed for the free non-library identifiers and more specific types to be
assigned to expressions (for instance, without rule (Sub) it would not be possible to assign type
(α_1 ∧ α_2) → α_1 to the identity function λx.x). The operations of ∀-introduction
and ∀-elimination are embedded in the structural rules.

Comparison with the System ⊢_P2R. Besides the presence of constants,
if-expressions, and let-expressions, the main differences between ⊢^Rec and ⊢_P2R are
the presence of the library environment B (which is not present in ⊢_P2R, although
its use has been suggested in 0 ) and the improved typing rules for recursive

⁴ I.e. for the identifiers defined in the libraries available to the programmer.

⁵ The fact that the environment A is relevant (i.e., x ∈ Dom(A) implies x ∈ FV(e)) is
used in rule (Rec) of Fig. 2 (as explained at the end of Section 4).
c                     Typeof(c)                    c        Typeof(c)
--------------------  ---------------------------  -------  ------------------------------
()                    ∀ε.unit                      pair     ∀α_1α_2.α_1 → α_2 → (α_1 × α_2)
true, false           ∀ε.bool                      prj_i    ∀α_1α_2.(α_1 × α_2) → α_i
not                   ∀ε.bool → bool               nil      ∀α.α list
and, or               ∀ε.bool × bool → bool        cons     ∀α.α → α list → α list
..., −1, 0, 1, ...    ∀ε.int                       hd       ∀α.α list → α
+, −, *, /            ∀ε.int × int → int           tl       ∀α.α list → α list
=, <, ...             ∀ε.int × int → bool          null     ∀α.α list → bool

Fig. 1. Types for constants



definitions. For instance we have that $\emptyset; \emptyset \vdash^{Rec} \mathrm{rec}_1\,\{x_1 = \lambda y.yy\} : ((\alpha_1 \to \alpha_2) \wedge \alpha_1) \to \alpha_2$, while this expression is not typable in $\vdash_{P_2R}$. This is due to the fact that, when typing a (possibly mutually) recursive definition $\mathrm{rec}_{i_0}\,\{x_1 = e_1, \ldots, x_n = e_n\}$, the rules of $\vdash_{P_2R}$ require that (also for those $i \in \{1, \ldots, n\}$ such that $x_i \notin \bigcup_{j \in \{1,\ldots,n\}} \mathrm{FV}(e_j)$) the rank 2 type $v_i$ assigned to $e_i$ must be such that $\mathrm{Gen}(A, v_i) \leq_{\forall 2,1} u_i$, for some simple type $u_i$. This anomaly has been pointed out in [?], which also describes a solution to the problem in the case of a single recursive definition ($\mathrm{rec}_1\,\{x_1 = e_1\}$): if $x_1 \notin \mathrm{FV}(e_1)$ then do not require that $\mathrm{Gen}(A, v_1) \leq_{\forall 2,1} u_1$. System $\vdash^{Rec}$ generalizes this idea to mutually recursive definitions: the constraint $\mathrm{Gen}(A, v_i) \leq_{\forall 2,1} \mathit{ui}_i$ is enforced only for those $i$ such that $x_i \in \bigcup_{j \in \{1,\ldots,n\}} \mathrm{FV}(e_j)$. 



Principal Typings for $\vdash^{Rec}$. The type system $\vdash^{Rec}$ has the principal typing property. The following definition and theorem are a formulation for $\vdash^{Rec}$ (taking into account the presence of the library environment $B$) of an analogous result for $\vdash_{P_2R}$ presented in [3]. 

Definition 3 (Principal typings for $\vdash^{Rec}$). A typing $A'; B \vdash^{Rec} e : v'$ is an instance of a typing $A; B \vdash^{Rec} e : v$ if there is a substitution $s$ such that $\mathrm{Dom}(s) = \mathrm{FTV}(A) \cup \mathrm{FTV}(v)$, $s(v) \leq_2 v'$ and $A' \leq_1 s(A)$. 

A typing $A; B \vdash^{Rec} e : v$ is a principal typing for $e$ w.r.t. $B$ if any other typing of $e$ w.r.t. $B$ is an instance of it. 

Theorem 1 (Principal typing property for $\vdash^{Rec}$). If $e$ is typable in $\vdash^{Rec}$ w.r.t. $B$, then it has a principal typing w.r.t. $B$. 



5 The System $\vdash^{Let,Rec}$: Typings for Local Definitions 

Rule (LetSugar) of $\vdash^{Rec}$ prevents us from assigning rank 2 types to the uses of local 
definitions. The following rule, which allows storing the rank 2 type schemes 
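(Id1) $\{x : u\}; B \vdash x : u$, where $x \notin \mathrm{Dom}(B)$ 

(Id2) $\emptyset; B, x : \forall\vec\alpha.v \vdash x : s_{\vec\alpha}(v)$ 

(Con) $\emptyset; B \vdash c : s_{\vec\alpha}(v)$, where $\mathrm{Typeof}(c) = \forall\vec\alpha.v$ 

(Abs) $$\dfrac{A, x : \mathit{ui}; B \vdash e : v}{A; B \vdash \lambda x.e : \mathit{ui} \to v}$$ where $x \notin \mathrm{Dom}(A \cup B)$ 

(App) $$\dfrac{A; B \vdash e : u_1 \wedge \cdots \wedge u_n \to v \qquad (\forall i \in \{1, \ldots, n\})\ A_i; B \vdash e_0 : v_i \quad \mathrm{Gen}(A_i, v_i) \leq_{\forall 2,0} u_i}{A + A_1 + \cdots + A_n; B \vdash e\,e_0 : v}$$ 

(IfSimple) $$\dfrac{A; B \vdash e : \mathrm{bool} \qquad A_1; B \vdash e_1 : u \qquad A_2; B \vdash e_2 : u}{A + A_1 + A_2; B \vdash \mathrm{if}\ e\ \mathrm{then}\ e_1\ \mathrm{else}\ e_2 : u}$$ 

(LetSugar) $$\dfrac{A; B \vdash (\lambda x.e)\,e_0 : v}{A; B \vdash \mathrm{let}\ x = e_0\ \mathrm{in}\ e : v}$$ 

(LetrecSugar) $$\dfrac{A; B \vdash (\lambda x_1.\cdots.\lambda x_n.e)\,e'_1 \cdots e'_n : v}{A; B \vdash \mathrm{letrec}\ \{x_1 = e_1, \ldots, x_n = e_n\}\ \mathrm{in}\ e : v}$$ 
where, for $i \in \{1, \ldots, n\}$, $e'_i = \mathrm{rec}_i\,\{x_1 = e_1, \ldots, x_n = e_n\}$ 

(Rec) $$\dfrac{(\forall i \in \{1, \ldots, n\})\ A_i; B \vdash e_i : v_i \qquad (\forall j \in \{j_1, \ldots, j_m\})\ \mathrm{Gen}(A, v_j) \leq_{\forall 2,1} \mathit{ui}_j}{A_0; B \vdash \mathrm{rec}_{i_0}\,\{x_1 = e_1, \ldots, x_n = e_n\} : v_{i_0}}$$ 
where $\{x_{j_1}, \ldots, x_{j_m}\} = \{x_1, \ldots, x_n\} \cap (\bigcup_{1 \leq i \leq n} \mathrm{FV}(e_i))$, 
$A = A_0, x_{j_1} : \mathit{ui}_{j_1}, \ldots, x_{j_m} : \mathit{ui}_{j_m} = A_1 + \cdots + A_n$, and $i_0 \in \{1, \ldots, n\}$ 

(Sub) $$\dfrac{A_1; B \vdash e : v_1 \qquad A_2 \leq_1 A_1 \qquad v_1 \leq_2 v_2}{A_2; B \vdash e : v_2}$$ 

Fig. 2. Type assignment rules (system $\vdash^{Rec}$) 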



inferred for local definitions in the environment $B$, has been suggested in [1] to 
overcome this limitation. 

(LetWeak) $$\dfrac{A_0; B \vdash e_0 : v_0 \qquad A; B, x : \mathrm{Gen}(A_0 \cup B, v_0) \vdash e : v}{A_0 + A; B \vdash \mathrm{let}\ x = e_0\ \mathrm{in}\ e : v}$$ 

However, the system $\vdash^{LetWeak}$, which uses such a rule to type let-expressions, has an unpleasant feature: for some $e_0$ and $e$ such that $\mathrm{FV}(e_0) \neq \emptyset$, replacing $(\lambda x.e)\,e_0$ with $\mathrm{let}\ x = e_0\ \mathrm{in}\ e$ may not preserve typability, as the following example shows. 

Example 1. We have that $\{y : (\alpha_1 \to \alpha_2) \wedge \alpha_1\}; \emptyset \vdash^{LetWeak} (\lambda x.xx)\,y : \alpha_2$ and so $\emptyset; \emptyset \vdash^{LetWeak} \lambda y.((\lambda x.xx)\,y) : ((\alpha_1 \to \alpha_2) \wedge \alpha_1) \to \alpha_2$. 

Instead $\lambda y.(\mathrm{let}\ x = y\ \mathrm{in}\ xx)$ cannot be typed in $\vdash^{LetWeak}$. 

The rank 2 type $v_0$ may contain free type variables (which are not allowed to occur in the library environment in the judgements of $\vdash^{Rec}$). So, if we want to use rule (LetWeak) instead of (LetSugar), we have to replace, in the type inference rules of Fig. 2, every occurrence of $\mathrm{Gen}(A, v)$ by $\mathrm{Gen}(A \cup B, v)$. 
This problem is due to the fact that, as in the ML type system, rule (LetWeak) associates with each let-bound identifier a type scheme which, in general, cannot express the principal typing of the body, $e_0$, of the local definition. To overcome this limitation we introduce the notions of pair scheme and pair environment. 

Definition 4 (Pair schemes and pair environments). A pair scheme $p$ is a formula $\forall\vec\alpha.(A, v)$ where $A$ is a rank 1 environment, $v$ is a rank 2 type, and $\{\vec\alpha\} = \mathrm{FTV}(A) \cup \mathrm{FTV}(v)$. 

A pair environment $L$ is an environment $\{x_1 : \forall\vec\alpha_1.(A_1, v_1), \ldots, x_n : \forall\vec\alpha_n.(A_n, v_n)\}$ of pair scheme assumptions for identifiers. 

The New Typing Rules. The new typing rules for local definitions make it possible to associate with each locally defined identifier a pair scheme representing the principal typing of its definition. The new type system uses an additional pair environment for locally defined identifiers (let-bound and letrec-bound identifiers), i.e. it has judgements of the form $A; B; L \vdash^{Let,Rec} e : v$, where $\mathrm{FV}(e) \subseteq \mathrm{Dom}(A \cup B \cup L)$, the domains of the three environments $A$, $B$, $L$ are pairwise disjoint, and $\mathrm{Dom}(A) = \mathrm{FV}(e) - \mathrm{Dom}(B \cup L)$. 

We say that $e$ is typable in $\vdash^{Let,Rec}$ w.r.t. the library environment $B$ and the local environment $L$ if there exists a typing $A; B; L \vdash^{Let,Rec} e : v$ for some $A$ and $v$. 

The type inference rules for system $\vdash^{Let,Rec}$ are in Fig. 3. There are two rules for typing a let-expression, $\mathrm{let}\ x = e_0\ \mathrm{in}\ e$, corresponding to the two cases $x \in \mathrm{FV}(e)$ and $x \notin \mathrm{FV}(e)$. The key rule is the first one, (LetNew), which uses the local environment $L$ to store a pair scheme ($\forall\vec\alpha.(A_0, v_0)$) representing the typings of the local definition $x = e_0$. Then the rule (Id3) makes it possible to associate a new typing with each use of a locally defined identifier. The new rule for typing letrec-expressions, (LetrecNew), simply relies on rule (LetNew). All the remaining rules ignore the local environment $L$ and behave as the corresponding rules of system $\vdash^{Rec}$. 

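The system $\vdash^{Let,Rec}$ extends both $\vdash^{Rec}$ and $\vdash^{LetWeak}$, and is such that $A; B; L \vdash^{Let,Rec} (\lambda x.e)\,e_0 : v$ implies $A; B; L \vdash^{Let,Rec} \mathrm{let}\ x = e_0\ \mathrm{in}\ e : v$, for all expressions $e_0$ and $e$. For instance (considering the expression in Example 1) we have 

1. $\{y : \alpha\}; \emptyset; \emptyset \vdash^{Let,Rec} y : \alpha$, by rule (Id1), 

2. $\{y : \alpha_1 \to \alpha_2\}; \emptyset; \{x : \forall\alpha.(\{y : \alpha\}, \alpha)\} \vdash^{Let,Rec} x : \alpha_1 \to \alpha_2$, by rule (Id3), with $s = [\alpha := \alpha_1 \to \alpha_2]$, 

3. $\{y : \alpha_1\}; \emptyset; \{x : \forall\alpha.(\{y : \alpha\}, \alpha)\} \vdash^{Let,Rec} x : \alpha_1$, by rule (Id3), with $s = [\alpha := \alpha_1]$, 

4. $\{y : (\alpha_1 \to \alpha_2) \wedge \alpha_1\}; \emptyset; \{x : \forall\alpha.(\{y : \alpha\}, \alpha)\} \vdash^{Let,Rec} xx : \alpha_2$, from hypotheses (2) and (3), by rule (App), 

5. $\{y : (\alpha_1 \to \alpha_2) \wedge \alpha_1\}; \emptyset; \emptyset \vdash^{Let,Rec} (\mathrm{let}\ x = y\ \mathrm{in}\ xx) : \alpha_2$, from hypotheses (1) and (4), by rule (LetNew), 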
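(Id1) $\{x : u\}; B; L \vdash x : u$, where $x \notin \mathrm{Dom}(B \cup L)$ and $\mathrm{Dom}(B) \cap \mathrm{Dom}(L) = \emptyset$ 

(Id2) $\emptyset; B, x : \forall\vec\alpha.v; L \vdash x : s_{\vec\alpha}(v)$, where $(\mathrm{Dom}(B) \cup \{x\}) \cap \mathrm{Dom}(L) = \emptyset$ 

(Id3) $s_{\vec\alpha}(A); B; L, x : \forall\vec\alpha.(A, v) \vdash x : s_{\vec\alpha}(v)$, where $\mathrm{Dom}(B) \cap (\mathrm{Dom}(L) \cup \{x\}) = \emptyset$ 

(Con) $\emptyset; B; L \vdash c : s_{\vec\alpha}(v)$, where $\mathrm{Dom}(B) \cap \mathrm{Dom}(L) = \emptyset$ and $\mathrm{Typeof}(c) = \forall\vec\alpha.v$ 

(Abs) $$\dfrac{A, x : \mathit{ui}; B; L \vdash e : v}{A; B; L \vdash \lambda x.e : \mathit{ui} \to v}$$ where $x \notin \mathrm{Dom}(A \cup B)$ 

(App) $$\dfrac{A; B; L \vdash e : u_1 \wedge \cdots \wedge u_n \to v \qquad (\forall i \in \{1, \ldots, n\})\ A_i; B; L \vdash e_0 : v_i \quad \mathrm{Gen}(A_i, v_i) \leq_{\forall 2,0} u_i}{A + A_1 + \cdots + A_n; B; L \vdash e\,e_0 : v}$$ 

(IfSimple) $$\dfrac{A; B; L \vdash e : \mathrm{bool} \qquad A_1; B; L \vdash e_1 : u \qquad A_2; B; L \vdash e_2 : u}{A + A_1 + A_2; B; L \vdash \mathrm{if}\ e\ \mathrm{then}\ e_1\ \mathrm{else}\ e_2 : u}$$ 

(LetNew) $$\dfrac{A_0; B; L \vdash e_0 : v_0 \qquad A; B; L, x : \forall\vec\alpha.(A_0, v_0) \vdash e : v}{A; B; L \vdash \mathrm{let}\ x = e_0\ \mathrm{in}\ e : v}$$ where $x \in \mathrm{FV}(e)$ and $\{\vec\alpha\} = \mathrm{FTV}(A_0) \cup \mathrm{FTV}(v_0)$ 

(LetVac) $$\dfrac{A_0; B; L \vdash e_0 : v_0 \qquad A; B; L \vdash e : v}{A_0 + A; B; L \vdash \mathrm{let}\ x = e_0\ \mathrm{in}\ e : v}$$ where $x \notin \mathrm{FV}(e)$ 

(LetrecNew) $$\dfrac{A; B; L \vdash \mathrm{let}\ x_1 = e'_1\ \mathrm{in} \cdots \mathrm{let}\ x_n = e'_n\ \mathrm{in}\ e : v}{A; B; L \vdash \mathrm{letrec}\ \{x_1 = e_1, \ldots, x_n = e_n\}\ \mathrm{in}\ e : v}$$ 
where, for $i \in \{1, \ldots, n\}$, $e'_i = \mathrm{rec}_i\,\{x_1 = e_1, \ldots, x_n = e_n\}$ 

(Rec) $$\dfrac{(\forall i \in \{1, \ldots, n\})\ A_i; B; L \vdash e_i : v_i \qquad (\forall j \in \{j_1, \ldots, j_m\})\ \mathrm{Gen}(A, v_j) \leq_{\forall 2,1} \mathit{ui}_j}{A_0; B; L \vdash \mathrm{rec}_{i_0}\,\{x_1 = e_1, \ldots, x_n = e_n\} : v_{i_0}}$$ 
where $\{x_{j_1}, \ldots, x_{j_m}\} = \{x_1, \ldots, x_n\} \cap (\bigcup_{1 \leq i \leq n} \mathrm{FV}(e_i))$, 
$A = A_0, x_{j_1} : \mathit{ui}_{j_1}, \ldots, x_{j_m} : \mathit{ui}_{j_m} = A_1 + \cdots + A_n$, and $i_0 \in \{1, \ldots, n\}$ 

(Sub) $$\dfrac{A_1; B; L \vdash e : v_1 \qquad A_2 \leq_1 A_1 \qquad v_1 \leq_2 v_2}{A_2; B; L \vdash e : v_2}$$ 

Fig. 3. Type assignment rules (system $\vdash^{Let,Rec}$) 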



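6. $\emptyset; \emptyset; \emptyset \vdash^{Let,Rec} \lambda y.(\mathrm{let}\ x = y\ \mathrm{in}\ xx) : ((\alpha_1 \to \alpha_2) \wedge \alpha_1) \to \alpha_2$, from hypothesis (5), by rule (Abs). 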

The following example shows another application of rule (LetNew). 

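Example 2. The expression $e = (\mathrm{let}\ g = \lambda f\,x.f(f\,x)\ \mathrm{in}\ g(\lambda y.\mathrm{cons}\ y\ \mathrm{nil}))$ can be typed neither in ML nor by system $\vdash^{Rec}$. With system $\vdash^{Let,Rec}$, instead, we have 

1. $\emptyset; \emptyset; \emptyset \vdash^{Let,Rec} \lambda f\,x.f(f\,x) : ((\alpha_1 \to \alpha_2) \wedge (\alpha_2 \to \alpha_3)) \to \alpha_1 \to \alpha_3$, 

2. $\emptyset; \emptyset; \emptyset \vdash^{Let,Rec} \lambda y.\mathrm{cons}\ y\ \mathrm{nil} : \alpha \to (\alpha\ \mathrm{list})$, 

3. $\emptyset; \emptyset; \{g : \forall\alpha_1\alpha_2\alpha_3.(\emptyset, ((\alpha_1 \to \alpha_2) \wedge (\alpha_2 \to \alpha_3)) \to \alpha_1 \to \alpha_3)\} \vdash^{Let,Rec} g : ((\alpha' \to (\alpha'\ \mathrm{list})) \wedge ((\alpha'\ \mathrm{list}) \to (\alpha'\ \mathrm{list\ list}))) \to \alpha' \to (\alpha'\ \mathrm{list\ list})$, by rule (Id3), with $s = [\alpha_1 := \alpha', \alpha_2 := \alpha'\ \mathrm{list}, \alpha_3 := \alpha'\ \mathrm{list\ list}]$, 

4. $\emptyset; \emptyset; \{g : \forall\alpha_1\alpha_2\alpha_3.(\emptyset, ((\alpha_1 \to \alpha_2) \wedge (\alpha_2 \to \alpha_3)) \to \alpha_1 \to \alpha_3)\} \vdash^{Let,Rec} g(\lambda y.\mathrm{cons}\ y\ \mathrm{nil}) : \alpha' \to (\alpha'\ \mathrm{list\ list})$, from hypotheses (2) and (3), by rule (App), 

5. $\emptyset; \emptyset; \emptyset \vdash^{Let,Rec} e : \alpha' \to (\alpha'\ \mathrm{list\ list})$, from hypotheses (1) and (4), by rule (LetNew). 

Also the expression of Section 1 can be typed with $\vdash^{Let,Rec}$. 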



Principal Typings for $\vdash^{Let,Rec}$. The following definition and theorem generalize the corresponding result of Section 4 by taking into account the presence of the local environment $L$. 

Definition 5 (Principal typings for $\vdash^{Let,Rec}$). A typing $A'; B; L \vdash^{Let,Rec} e : v'$ is an instance of a typing $A; B; L \vdash^{Let,Rec} e : v$ if there is a substitution $s$ such that $\mathrm{Dom}(s) = \mathrm{FTV}(A) \cup \mathrm{FTV}(v)$, $s(v) \leq_2 v'$ and $A' \leq_1 s(A)$. 

A typing $A; B; L \vdash^{Let,Rec} e : v$ is a principal typing for $e$ w.r.t. $B$ and $L$ if any other typing of $e$ w.r.t. $B$ and $L$ is an instance of it. When $L = \emptyset$ we say that $A; B; \emptyset \vdash^{Let,Rec} e : v$ is a principal typing for $e$ w.r.t. $B$. 

Theorem 2 (Principal typing property for $\vdash^{Let,Rec}$). If $e$ is typable in $\vdash^{Let,Rec}$ w.r.t. $B$ and $L$, then it has a principal typing w.r.t. $B$ and $L$. 



6 The System $\vdash^{If,Let,Rec}$: Typings for if-Expressions 

The rule (IfSimple) of $\vdash^{Rec}$ and $\vdash^{Let,Rec}$ seems overly restrictive: it does not allow rank 2 types to be assigned to the branches of if-expressions. We may think of replacing that rule by the following rule 

(IfStrong) $$\dfrac{A; B; L \vdash e : \mathrm{bool} \qquad A_1; B; L \vdash e_1 : v \qquad A_2; B; L \vdash e_2 : v}{A + A_1 + A_2; B; L \vdash \mathrm{if}\ e\ \mathrm{then}\ e_1\ \mathrm{else}\ e_2 : v}$$ 

which allows a rank 2 type to be assigned to the branches of an if-expression. However the resulting system, $\vdash^{Strong}$, has neither the principal typing property nor the principal type property, as the following example shows. 

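Example 3. Take the expressions $e_1 = \lambda f.f\,3$, $e_2 = \lambda g.\mathrm{prj}_1(g(\mathrm{pair}\ 1\ 4))$, and $e_0 = \mathrm{if}\ z\ \mathrm{then}\ e_1\ \mathrm{else}\ e_2$. We have 

- $\emptyset; \emptyset; \emptyset \vdash^{Let,Rec} e_1 : v_1$, where $v_1 = (\mathrm{int} \to \mathrm{int}) \to \mathrm{int}$, and 

- $\emptyset; \emptyset; \emptyset \vdash^{Let,Rec} e_2 : v_2$, where $v_2 = ((\mathrm{int} \times \mathrm{int}) \to (\mathrm{int} \times \mathrm{int})) \to \mathrm{int}$, 

but the expression $e_0$ cannot be typed by using $\vdash^{Let,Rec}$. Instead with system $\vdash^{Strong}$ we have: $\{z : \mathrm{bool}\}; \emptyset; \emptyset \vdash^{Strong} e_0 : v_0$, where $v_0 = ((\mathrm{int} \to \mathrm{int}) \wedge ((\mathrm{int} \times \mathrm{int}) \to (\mathrm{int} \times \mathrm{int}))) \to \mathrm{int}$ is the least upper bound of $v_1$ and $v_2$ w.r.t. $\leq_2$. Also the expressions $e'_0\,e_2$ and $e''_0\,e_1\,e_2$, where $e'_0 = \lambda x.\mathrm{if}\ z\ \mathrm{then}\ e_1\ \mathrm{else}\ x$ and $e''_0 = \lambda x_1\,x_2.\mathrm{if}\ z\ \mathrm{then}\ x_1\ \mathrm{else}\ x_2$, are not typable by $\vdash^{Let,Rec}$ and typable by $\vdash^{Strong}$. For instance we have 

- $\{z : \mathrm{bool}\}; \emptyset; \emptyset \vdash^{Strong} e'_0 : v'_0$, where $v'_0 = (\alpha \to \mathrm{int}) \to ((\mathrm{int} \to \mathrm{int}) \wedge \alpha) \to \mathrm{int}$ (and also $\{z : \mathrm{bool}\}; \emptyset; \emptyset \vdash^{Strong} e'_0 : v_2 \to v_0$, so $\{z : \mathrm{bool}\}; \emptyset; \emptyset \vdash^{Strong} e'_0\,e_2 : v_0$) 

- $\{z : \mathrm{bool}\}; \emptyset; \emptyset \vdash^{Strong} e''_0 : v_1 \to v_2 \to v_0$ (so $\{z : \mathrm{bool}\}; \emptyset; \emptyset \vdash^{Strong} e''_0\,e_1\,e_2 : v_0$). 

Note that in $\vdash^{Strong}$ there is no principal type for $e''_0$ (the "natural candidate" is $u = \alpha \to \alpha \to \alpha$, but there exists no substitution $s$ such that $s(u) \leq_2 v_1 \to v_2 \to v_0$). This problem is due to the fact that the rank 2 schemes cannot express the fact that, because of rules (Sub) and (IfStrong), a type $v$ can be assigned to an if-expression $\mathrm{if}\ e\ \mathrm{then}\ e_1\ \mathrm{else}\ e_2$ if and only if it is an upper bound w.r.t. $\leq_2$ of a pair of types $v_1$ and $v_2$ that can be inferred for $e_1$ and $e_2$, respectively. 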

In order to preserve the principal typing property of $\vdash^{Let,Rec}$ we restrict rule (IfStrong) by limiting the use of intersection in the type assigned to the branches of an if-expression. The condition that we will use to restrict rule (IfStrong) is based on the notions of $\wedge$-index and $\to$-index of a type and of index of an expression. 

Definition 6 ($\wedge$-index and $\to$-index of a type). For every type $v \in T_2$ the $\wedge$-index of $v$, $\mathrm{Ind}^{\wedge}(v)$, and the $\to$-index of $v$, $\mathrm{Ind}^{\to}(v)$, are the natural numbers defined in Fig. 4 (note that $\mathrm{Ind}^{\wedge}(v) \leq \mathrm{Ind}^{\to}(v)$). 

The fundamental properties of the metrics $\mathrm{Ind}^{\wedge}$ and $\mathrm{Ind}^{\to}$ are expressed by the following proposition. 

Proposition 1. 
1. If $\mathrm{Ind}^{\wedge}(v) = p$ and $\mathrm{Ind}^{\to}(v) = p + q$ ($p, q \geq 0$), then $v$ is of the form $v = \mathit{ui}_1 \to \cdots \to \mathit{ui}_p \to u_1 \to \cdots \to u_q \to u$ for some $\mathit{ui}_1, \ldots, \mathit{ui}_p \in T_1$, $u_1, \ldots, u_q, u \in T_0$, $\mathit{ui}_p \notin T_0$, and $u$ not of the form $u' \to u''$. 

2. For every substitution $s$, $\mathrm{Ind}^{\wedge}(v) \geq \mathrm{Ind}^{\wedge}(s(v))$ and $\mathrm{Ind}^{\to}(v) \leq \mathrm{Ind}^{\to}(s(v))$. 

3. If $v \leq_2 v'$ then $\mathrm{Ind}^{\wedge}(v) \leq \mathrm{Ind}^{\wedge}(v')$ and $\mathrm{Ind}^{\to}(v) = \mathrm{Ind}^{\to}(v')$. 

Definition 7 (Index of an expression). An index environment $I$ is an environment $\{x_1 : i_1, \ldots, x_n : i_n\}$ of natural number (index) assumptions for identifiers. For every expression $e$ and index environment $I$ such that $\mathrm{FV}(e) \subseteq \mathrm{Dom}(I)$, the index of $e$ in $I$, $\mathrm{Ind}(e, I)$, is the natural number defined by the clauses in Fig. 5. 

7 Remember that $\wedge$ is idempotent, so, for any $u \in T_0$, the type $u \wedge u$ is considered to be an element of $T_0$. 
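$\mathrm{Ind}^{\wedge}(u) = 0$, for $u \in T_0$ 

$\mathrm{Ind}^{\wedge}(\mathit{ui} \to v) = 1 + \mathrm{Ind}^{\wedge}(v)$, for $\mathit{ui} \to v \in T_2 - T_0$ 

$\mathrm{Ind}^{\to}(\mathrm{unit}) = \mathrm{Ind}^{\to}(\mathrm{bool}) = \mathrm{Ind}^{\to}(\mathrm{int}) = \mathrm{Ind}^{\to}(u_1 \times u_2) = \mathrm{Ind}^{\to}(u\ \mathrm{list}) = 0$ 

$\mathrm{Ind}^{\to}(\mathit{ui} \to v) = 1 + \mathrm{Ind}^{\to}(v)$ 

Fig. 4. The functions $\mathrm{Ind}^{\wedge}(v)$ and $\mathrm{Ind}^{\to}(v)$ 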
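
The two indexes of Fig. 4 can be illustrated by a small Python sketch. It is only an illustration under stated assumptions, not part of the type system: the class names (`Base`, `Prod`, `ListOf`, `Arrow`), the helper `arrow`, and the representation of the left-hand side of an arrow as a set of simple types are all hypothetical. The set representation makes intersection idempotent, and type variables are treated like base types (index 0), which is an assumption since the figure does not list them.

```python
from dataclasses import dataclass
from typing import FrozenSet, Union


@dataclass(frozen=True)
class Base:
    """unit, bool, int, or a type variable (assumed to have index 0)."""
    name: str


@dataclass(frozen=True)
class Prod:
    left: "Ty"
    right: "Ty"


@dataclass(frozen=True)
class ListOf:
    elem: "Ty"


@dataclass(frozen=True)
class Arrow:
    # Components of the intersection on the left of the arrow; a singleton
    # set stands for a simple argument type (so u /\ u collapses to u).
    arg: FrozenSet["Ty"]
    result: "Ty"


Ty = Union[Base, Prod, ListOf, Arrow]


def arrow(*parts):
    """arrow(u1, ..., un, v) builds (u1 /\\ ... /\\ un) -> v."""
    *components, result = parts
    return Arrow(frozenset(components), result)


def is_simple(t):
    """True if t contains no proper intersection (t is in T0).

    Assumes the rank 2 discipline: components of an intersection are simple.
    """
    if isinstance(t, Arrow):
        return len(t.arg) == 1 and is_simple(t.result)
    return True


def ind_and(t):
    """The /\\-index: 0 on simple types, else 1 + index of the result."""
    if is_simple(t):
        return 0
    return 1 + ind_and(t.result)


def ind_arrow(t):
    """The ->-index: number of top-level arrows along the result spine."""
    if isinstance(t, Arrow):
        return 1 + ind_arrow(t.result)
    return 0


INT, ALPHA = Base("int"), Base("alpha")
# v0 = ((int -> int) /\ ((int x int) -> (int x int))) -> int, from Example 3
V0 = arrow(arrow(INT, INT), arrow(Prod(INT, INT), Prod(INT, INT)), INT)
```

With this encoding, `ind_and(V0)` and `ind_arrow(V0)` are both 1, in line with item 1 of Proposition 1 (one argument, which is a proper intersection).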



The fundamental property of the metric $\mathrm{Ind}$ is given by the following proposition (the proof is by structural induction on $e$; the only non-trivial case is the computation of the index of the auxiliary expressions $\mathrm{rec}_i\,\{x_1 = e_1, \ldots, x_n = e_n\}$). 

Proposition 2. If $A; B; L \vdash^{Let,Rec} e : v$ then $\mathrm{Ind}(e, \{x : 0 \mid x \in \mathrm{FV}(e)\}) \leq \mathrm{Ind}^{\to}(v)$. 

This implies that every $\vdash^{Let,Rec}$-typable expression $e$ has an index and, if $\mathrm{Ind}(e, \{x : 0 \mid x \in \mathrm{FV}(e)\}) = i$, then $e$ is a function that can accept at least $i$ arguments. The indexes of the open subexpressions of a closed expression $e$ are computed by associating index 0 to formal parameters of functions, and the index of the corresponding definition to locally defined identifiers. For instance: $\mathrm{Ind}(y, \{y : 0\}) = 0$, $\mathrm{Ind}(\lambda y.y, \emptyset) = 1$, and $\mathrm{Ind}(g, \{g : 1\}) = 1$, so $\mathrm{Ind}(\mathrm{let}\ g = (\lambda y.y)\ \mathrm{in}\ g, \emptyset) = 1$. For an example involving mutually recursive definitions, take the auxiliary expression $e = \mathrm{rec}_1\,\{x_1 = \lambda w.x_2\,(w + 1),\ x_2 = \lambda y\,z.\mathrm{if}\ (y > z)\ \mathrm{then}\ 1\ \mathrm{else}\ z * (x_1\,y\,z)\}$. We have $\mathrm{Ind}(e, \emptyset) = 2$ (note that this requires two iterations of the while-loop in Fig. 5). We remark that the clauses in Fig. 5 are just a specification, and do not represent an efficient algorithm for computing the index of an expression. 

The New Typing Rules. Let (IfNew) be the restriction of rule (IfStrong) requiring that the rank 2 type, say $v$, assigned to the branches of an if-expression, $\mathrm{if}\ e\ \mathrm{then}\ e_1\ \mathrm{else}\ e_2$, must satisfy the condition 

$$\mathrm{Ind}^{\wedge}(v) \leq \mathrm{Ind}(\mathrm{if}\ e\ \mathrm{then}\ e_1\ \mathrm{else}\ e_2, \{x : 0 \mid x \in \mathrm{FV}(\mathrm{if}\ e\ \mathrm{then}\ e_1\ \mathrm{else}\ e_2)\}).$$ 

By using rule (IfNew) instead of (IfSimple), it is possible to assign types $v_0$, $v'_0$ and $v_0$ to the expressions $e_0$, $e'_0$ and $e'_0\,e_2$ of Example 3, respectively. Instead, it is not possible to assign type $v_1 \to v_2 \to v_0$ to the expression $e''_0$, so the expression $e''_0\,e_1\,e_2$ cannot be typed. 

In order to compute more accurate indexes for expressions involving free library and local identifiers we introduce indexed environments, which store the indexes of library and local definitions. 

8 This does not hold for non-$\vdash^{Let,Rec}$-typable expressions (unless we restrict to expressions not containing conditionals). Take for instance $e = \lambda x.\mathrm{if}\ x\ \mathrm{then}\ 0\ \mathrm{else}\ \lambda y.y$. We have $\mathrm{Ind}(e, \emptyset) = 2$, but $e\ \mathrm{true}$ is not a function. 
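$\mathrm{Ind}(c, I) = \mathrm{Ind}^{\to}(\mathrm{Typeof}(c))$ 

$\mathrm{Ind}(x, I) = i$ if $x : i \in I$ 

$\mathrm{Ind}(\lambda x.e, I) = \mathrm{Ind}(e, I \cup \{x : 0\}) + 1$ 

$\mathrm{Ind}(e_1\,e_2, I) = 0$ if $\mathrm{Ind}(e_1, I) = 0$, and $\mathrm{Ind}(e_1\,e_2, I) = \mathrm{Ind}(e_1, I) - 1$ if $\mathrm{Ind}(e_1, I) \geq 1$ 

$\mathrm{Ind}(\mathrm{if}\ e\ \mathrm{then}\ e_1\ \mathrm{else}\ e_2, I) = \mathrm{Max}(\mathrm{Ind}(e_1, I), \mathrm{Ind}(e_2, I))$ 

$\mathrm{Ind}(\mathrm{let}\ x = e_0\ \mathrm{in}\ e, I) = \mathrm{Ind}(e, I \cup \{x : \mathrm{Ind}(e_0, I)\})$ 

$\mathrm{Ind}(\mathrm{letrec}\ \{x_1 = e_1, \ldots, x_n = e_n\}\ \mathrm{in}\ e, I) = \mathrm{Ind}(e, I \cup \{x_1 : \mathrm{Ind}(e'_1, I), \ldots, x_n : \mathrm{Ind}(e'_n, I)\})$ 
where, for $i \in \{1, \ldots, n\}$, $e'_i = \mathrm{rec}_i\,\{x_1 = e_1, \ldots, x_n = e_n\}$ 

$\mathrm{Ind}(\mathrm{rec}_i\,\{x_1 = e_1, \ldots, x_n = e_n\}, I)$ = 
  begin 
    $(k_1, \ldots, k_n) := (0, \ldots, 0)$; 
    $(j_1, \ldots, j_n) := (\mathrm{Ind}(e_1, I), \ldots, \mathrm{Ind}(e_n, I))$; 
    while $(j_1, \ldots, j_n) \neq (k_1, \ldots, k_n)$ do begin 
      $(k_1, \ldots, k_n) := (j_1, \ldots, j_n)$; 
      $I' := I \cup \{x_1 : j_1, \ldots, x_n : j_n\}$; 
      $(j_1, \ldots, j_n) := (\mathrm{Ind}(e_1, I'), \ldots, \mathrm{Ind}(e_n, I'))$ 
    end; 
    return $j_i$ 
  end 

Fig. 5. The function $\mathrm{Ind}(e, I)$ 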
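
The clauses of Fig. 5 can be turned into a direct (and deliberately naive) Python sketch. Several things here are assumptions made for illustration only: expressions are encoded as tagged tuples; `CONST_INDEX` is a hypothetical table giving the $\to$-index of a few constants (unlisted constants, such as integer literals, get index 0); in the first step of the `rec` clause the recursively defined identifiers are given index 0; positions in a `rec` are 0-based; and `max_rounds` is a guard added here because the iteration need not terminate on untypable input.

```python
# Hypothetical ->-indexes of some constants, read off their types in Fig. 1.
CONST_INDEX = {"+": 1, "*": 1, ">": 1, "pair": 2, "cons": 2}


def rec_indexes(defs, env, max_rounds=1000):
    """Fixpoint iteration of the rec clause.

    defs is a list of (name, expression) pairs. Returns the tuple of
    indexes and the number of while-loop iterations that were needed.
    """
    names = [x for x, _ in defs]

    def step(idx):
        inner = {**env, **dict(zip(names, idx))}
        return tuple(ind(d, inner, max_rounds) for _, d in defs)

    ks = tuple(0 for _ in defs)
    js = step(ks)  # assumption: recursive identifiers start at index 0
    rounds = 0
    while js != ks:
        if rounds == max_rounds:
            raise ValueError("index iteration did not stabilise")
        ks = js
        js = step(ks)
        rounds += 1
    return js, rounds


def ind(e, env, max_rounds=1000):
    """Index of expression e in the index environment env (a dict)."""
    tag = e[0]
    if tag == "const":
        return CONST_INDEX.get(e[1], 0)
    if tag == "var":
        return env[e[1]]
    if tag == "lam":                      # ("lam", x, body)
        return ind(e[2], {**env, e[1]: 0}, max_rounds) + 1
    if tag == "app":                      # ("app", e1, e2)
        return max(ind(e[1], env, max_rounds) - 1, 0)
    if tag == "if":                       # ("if", cond, e1, e2)
        return max(ind(e[2], env, max_rounds), ind(e[3], env, max_rounds))
    if tag == "let":                      # ("let", x, e0, body)
        _, x, e0, body = e
        return ind(body, {**env, x: ind(e0, env, max_rounds)}, max_rounds)
    if tag == "letrec":                   # ("letrec", defs, body)
        _, defs, body = e
        js, _ = rec_indexes(defs, env, max_rounds)
        local = {x: j for (x, _), j in zip(defs, js)}
        return ind(body, {**env, **local}, max_rounds)
    if tag == "rec":                      # ("rec", i, defs), i is 0-based
        _, i, defs = e
        return rec_indexes(defs, env, max_rounds)[0][i]
    raise ValueError(f"unknown expression tag: {tag!r}")


def app(f, *args):
    """Left-nested application f a1 ... an."""
    for a in args:
        f = ("app", f, a)
    return f


def binop(op, a, b):
    """Infix operator applied to a pair, e.g. a + b."""
    return app(("const", op), app(("const", "pair"), a, b))


W, Y, Z = ("var", "w"), ("var", "y"), ("var", "z")
ONE = ("const", "1")
# x1 = \w. x2 (w + 1);  x2 = \y z. if (y > z) then 1 else z * (x1 y z)
DEFS = [
    ("x1", ("lam", "w", app(("var", "x2"), binop("+", W, ONE)))),
    ("x2", ("lam", "y", ("lam", "z",
            ("if", binop(">", Y, Z), ONE,
             binop("*", Z, app(("var", "x1"), Y, Z)))))),
]
```

Under these assumptions `ind(("rec", 0, DEFS), {})` evaluates to 2 and `rec_indexes(DEFS, {})` reports two loop iterations, matching the mutually recursive example discussed in the text.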



Definition 8 (Indexed environments). An indexed rank 2 scheme environment $B$ is an environment $\{x_1 : (i_1, \forall\vec\alpha_1.v_1), \ldots, x_n : (i_n, \forall\vec\alpha_n.v_n)\}$ of index and closed rank 2 scheme assumptions for identifiers such that, for every $j \in \{1, \ldots, n\}$, $i_j \leq \mathrm{Ind}^{\to}(v_j)$. 

An indexed pair environment $L$ is an environment $\{x_1 : (i_1, \forall\vec\alpha_1.(A_1, v_1)), \ldots, x_n : (i_n, \forall\vec\alpha_n.(A_n, v_n))\}$ of index and pair scheme assumptions for identifiers such that, for every $j \in \{1, \ldots, n\}$, $i_j \leq \mathrm{Ind}^{\to}(v_j)$. 

For every environment $T$ containing rank 1, indexed rank 2, and indexed pair assumptions define $\mathrm{Ind}(T) = \{x : 0 \mid x : \mathit{ui} \in T\} \cup \{x : i \mid x : (i, \forall\vec\alpha.v) \in T\} \cup \{x : i \mid x : (i, p) \in T\}$. Let $\vdash^{If,Let,Rec}$ be the extension of $\vdash^{Let,Rec}$ which uses (in the typing judgements of Fig. 3) library indexed environments and local indexed environments, uses (see Fig. 6) rule (IfNew') instead of (IfSimple) and rules (LetNew'), (Id2'), and (Id3') instead of the corresponding rules of $\vdash^{Let,Rec}$. Rule (LetNew') computes and stores the indexes of local definitions in the local environment so that more accurate indexes can be computed, while rules (Id2') and (Id3') simply ignore the indexes and behave as the corresponding rules of $\vdash^{Let,Rec}$. A rephrasing of Proposition 2 holds also for system $\vdash^{If,Let,Rec}$. 

Proposition 3. If $A; B; L \vdash^{If,Let,Rec} e : v$ then $\mathrm{Ind}(e, \mathrm{Ind}(A \cup B \cup L)) \leq \mathrm{Ind}^{\to}(v)$. 

This guarantees that the indexes required in rules (IfNew') and (LetNew') in Fig. 6 (which involve only typable expressions) are always defined. 
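(IfNew') $$\dfrac{A; B; L \vdash e : \mathrm{bool} \qquad A_1; B; L \vdash e_1 : v \qquad A_2; B; L \vdash e_2 : v \qquad \mathrm{Ind}^{\wedge}(v) \leq \mathrm{Ind}(\mathrm{if}\ e\ \mathrm{then}\ e_1\ \mathrm{else}\ e_2, \mathrm{Ind}((A + A_1 + A_2) \cup B \cup L))}{A + A_1 + A_2; B; L \vdash \mathrm{if}\ e\ \mathrm{then}\ e_1\ \mathrm{else}\ e_2 : v}$$ 

(LetNew') $$\dfrac{A_0; B; L \vdash e_0 : v_0 \qquad A; B; L, x : (i_0, \forall\vec\alpha.(A_0, v_0)) \vdash e : v}{A; B; L \vdash \mathrm{let}\ x = e_0\ \mathrm{in}\ e : v}$$ where $x \in \mathrm{FV}(e)$, $i_0 = \mathrm{Ind}(e_0, \mathrm{Ind}(A_0 \cup B \cup L))$ and $\{\vec\alpha\} = \mathrm{FTV}(A_0) \cup \mathrm{FTV}(v_0)$ 

(Id2') $\emptyset; B, x : (i, \forall\vec\alpha.v); L \vdash x : s_{\vec\alpha}(v)$, where $(\mathrm{Dom}(B) \cup \{x\}) \cap \mathrm{Dom}(L) = \emptyset$ 

(Id3') $s_{\vec\alpha}(A); B; L, x : (i, \forall\vec\alpha.(A, v)) \vdash x : s_{\vec\alpha}(v)$, where $\mathrm{Dom}(B) \cap (\mathrm{Dom}(L) \cup \{x\}) = \emptyset$ 

Fig. 6. Typing rules for if-expressions and indexes manipulation (system $\vdash^{If,Let,Rec}$) 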



Principal Typings and Type Inference for $\vdash^{If,Let,Rec}$. Principal typings w.r.t. a library environment $B$ and a local environment $L$ are defined as for $\vdash^{Let,Rec}$ (see Definition 5). The following result holds. 

Theorem 3 (Principal typing property for $\vdash^{If,Let,Rec}$). If $e$ is typable in $\vdash^{If,Let,Rec}$ w.r.t. $B$ and $L$, then it has a principal typing w.r.t. $B$ and $L$. 

System $\vdash^{If,Let,Rec}$ has a complete inference algorithm (not included in this paper) that, for any expression $e$, library environment $B$, and local environment $L$ such that $\mathrm{Dom}(B) \cap \mathrm{Dom}(L) = \emptyset$, computes a principal typing for $e$ w.r.t. $B$ and $L$. 

Acknowledgements. I thank Mario Coppo, Paola Giannini, and the referees of earlier 
versions of this paper, for suggestions to improve the presentation. 
Abstract. We present an approach for the rule-based transformation 
of hierarchically structured (hyper)graphs. In these graphs, distinguished 
hyperedges contain graphs that can be hierarchical again. Our framework 
extends the double-pushout approach from flat to hierarchical graphs. In 
particular, we show how to recursively construct pushouts and pushout 
complements of hierarchical graphs and graph morphisms. To further 
enhance the expressiveness of the approach, we also introduce rule schemata 
with variables, which make it possible to copy and to remove hierarchical subgraphs. 



1 Introduction 

Recently, the idea of using rule-based graph transformation as a framework 
for specification and programming has received some attention, and several 
researchers have proposed structuring mechanisms for graph transformation 
systems to make progress towards this goal (see for example [2, 8, 10]). Structuring 
mechanisms will be indispensable to manage large numbers of rules and to 
develop complex systems from small components that are easy to comprehend. 
Moreover, we believe that it will be necessary to structure the graphs that are 
subject to transformation, too, in order to cope with applications of a realistic 
size. A mechanism for hiding (or abstracting from) subgraphs in large graphs will 
facilitate both the control of rule applications and the visualization of graphs. 

In this paper we introduce hierarchical hypergraphs in which certain 
hyperedges, called frames, contain hypergraphs that can be hierarchical again, with 
an arbitrary depth of nesting. We show that the well-known double-pushout 
approach to graph transformation extends smoothly to these hierarchical 
(hyper)graphs, by giving recursive constructions for pushouts and pushout 
complements in the category of hierarchical graphs. Hierarchical transformation 
rules consist of hierarchical graphs and can be applied at all levels of the 
hierarchy, where the "dangling condition" known from the transformation of flat 
graphs is adapted in a natural way. 

* This work has been partially supported by the ESPRIT Working Group Applica- 
tions of Graph Transformation (Appligraph) and by the TMR Research Network 
Getgrats through the University of Bremen. 

** On leave from Universitat Bremen. 

J. Tiuryn (Ed.): FOSSACS 2000, LNCS 1784, pp. 98- ITntl 2000. 

© Springer-Verlag Berlin Heidelberg 2000 
To further enhance the expressiveness of hierarchical graph transformation 
for programming purposes (without damaging the theory), we also introduce 
rule schemata containing frame variables. These variables can be instantiated 
with frames containing hierarchical graphs, and can be used to copy or remove 
frames without looking at their contents. Our running example of a queue 
implementation indicates that this concept is useful, as it makes it possible to delete and to 
duplicate queue entries regardless of their structure and size. 

Finally, we relate hierarchical graph transformation to the conventional trans- 
formation of flat graphs by introducing a flattening operation. Flattening recur- 
sively replaces each frame in a hierarchical graph by its contents, yielding a flat 
graph without frames. Every transformation step on hierarchical graphs — under 
a mild assumption on the transformed graph — gives rise to a conventional step 
on the flattened graphs by using the flattened rule. 



2 Graph Transformation 

If $S$ is a set, the set of all finite sequences over $S$, including the empty sequence $\lambda$, is denoted by $S^*$. The $i$th element of a sequence $s$ is denoted by $s(i)$, and its length by $|s|$. If $f\colon S \to T$ is a function then the canonical extensions of $f$ to the powerset of $S$ and to $S^*$ are also denoted by $f$. The composition $g \circ f$ of functions $f\colon S \to T$ and $g\colon T \to U$ is defined by $(g \circ f)(s) = g(f(s))$ for $s \in S$. 

A pushout in a category $C$ (see, e.g., [1]) is a tuple $(m_1, m_2, n_1, n_2)$ of morphisms $m_i\colon O \to O_i$ and $n_i\colon O_i \to O'$ with $n_1 \circ m_1 = n_2 \circ m_2$, such that for all morphisms $n'_i\colon O_i \to P$ ($i \in \{1, 2\}$) with $n'_1 \circ m_1 = n'_2 \circ m_2$ there is a unique morphism $n\colon O' \to P$ satisfying $n \circ n_1 = n'_1$ and $n \circ n_2 = n'_2$. 

Let $L$ be an arbitrary but fixed set of labels. A hypergraph $H$ is a quintuple $(V_H, E_H, att_H, lab_H, p_H)$ such that 

- $V_H$ and $E_H$ are finite sets of nodes and hyperedges, respectively, 

- $att_H\colon E_H \to V_H^*$ is the attachment function, 

- $lab_H\colon E_H \to L$ is the labelling function, and 

- $p_H \in V_H^*$ is a sequence of nodes, called the points of $H$. 

In the following, we will simply say graph instead of hypergraph and edge instead of hyperedge. We denote by $A_H$ the set $V_H \cup E_H$ of atoms of $H$. In order to make this a useful notation, we shall always assume without loss of generality that $V_H$ and $E_H$ are disjoint, for every graph $H$. 

A morphism $m\colon G \to H$ between graphs $G$ and $H$ is a pair $(m_V, m_E)$ of mappings $m_V\colon V_G \to V_H$ and $m_E\colon E_G \to E_H$ such that $m_V(p_G) = p_H$ and, for all $e \in E_G$, $lab_H(m_E(e)) = lab_G(e)$ and $att_H(m_E(e)) = m_V(att_G(e))$. Such a morphism is injective (surjective, bijective) if both $m_V$ and $m_E$ are injective (respectively surjective or bijective). If there is a bijective morphism $m\colon G \to H$ then $G$ and $H$ are isomorphic, which is denoted by $G \cong H$. For a morphism $m\colon G \to H$ and $a \in A_G$ we let $m(a)$ denote $m_V(a)$ if $a \in V_G$ and $m_E(a)$ if $a \in E_G$. The composition of morphisms is defined componentwise. 
For graphs $G$ and $H$ such that $A_G \cap A_H = \emptyset$, the disjoint union $G + H$ yields the graph $(V_G \cup V_H, E_G \cup E_H, att, lab, p_G)$, where 

$$att(e) = \begin{cases} att_G(e) & \text{if } e \in E_G \\ att_H(e) & \text{otherwise} \end{cases} \qquad lab(e) = \begin{cases} lab_G(e) & \text{if } e \in E_G \\ lab_H(e) & \text{otherwise} \end{cases}$$ 

for all edges $e \in E_G \cup E_H$. (If $A_G \cap A_H \neq \emptyset$, we assume that some implicit renaming of atoms takes place.) Notice that this operation is associative but does not commute since $G + H$ inherits its points from the first argument. 

We recall the following well-known facts about pushouts and pushout complements in the category of graphs and graph morphisms (see [5]). Let $m_1\colon G \to H_1$ and $m_2\colon G \to H_2$ be morphisms. Then there is a graph $H$ and there are morphisms $n_1\colon H_1 \to H$ and $n_2\colon H_2 \to H$ such that $(m_1, m_2, n_1, n_2)$ is a pushout. Furthermore, $H$ and the $n_i$ are determined as follows. Let $H'$ be the disjoint union of $H_1$ and $H_2$, and let $\sim$ be the equivalence relation on $A_{H'}$ generated by the set of all pairs $(m_1(a), m_2(a))$ such that $a \in A_G$. Then $H$ is the graph obtained from $H'$ by identifying all atoms $a, a'$ such that $a \sim a'$ (i.e., $H$ is the quotient graph $H'/{\sim}$). Moreover, for $i \in \{1, 2\}$ and $a \in A_{H_i}$, $n_i(a) = [a]_{\sim}$, where $[a]_{\sim}$ denotes the equivalence class of $a$ according to $\sim$. 
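
The quotient construction just recalled can be sketched in Python with a union-find structure. This is an illustrative sketch under assumptions, not the construction as stated in any library: a graph is encoded as a dictionary with keys `"nodes"` (a set), `"edges"` (a mapping from edge name to a pair of label and attachment tuple) and `"points"` (a tuple); a morphism is a single dictionary on atoms; and the function name `pushout` is hypothetical. Atoms of the disjoint union are tagged with 1 or 2, and each equivalence class is named by its union-find representative.

```python
def pushout(G, H1, H2, m1, m2):
    """Glue H1 and H2 along G. Returns (H, n1, n2).

    m1 and m2 map every atom (node or edge) of G to an atom of H1 and H2,
    respectively, and are assumed to be graph morphisms.
    """
    def atoms(X):
        return list(X["nodes"]) + list(X["edges"])

    # Disjoint union: tag each atom with the index of the graph it is from.
    parent = {(i, a): (i, a) for i, X in ((1, H1), (2, H2)) for a in atoms(X)}

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    # Generate the equivalence from the pairs (m1(a), m2(a)), a an atom of G.
    for a in atoms(G):
        ra, rb = find((1, m1[a])), find((2, m2[a]))
        if ra != rb:
            parent[rb] = ra

    # n_i sends an atom to (the representative of) its equivalence class.
    n1 = {a: find((1, a)) for a in atoms(H1)}
    n2 = {a: find((2, a)) for a in atoms(H2)}

    nodes = {n1[v] for v in H1["nodes"]} | {n2[v] for v in H2["nodes"]}
    edges = {}
    for n, X in ((n1, H1), (n2, H2)):
        for e, (label, att) in X["edges"].items():
            edges[n[e]] = (label, tuple(n[v] for v in att))
    # The disjoint union inherits its points from the first argument.
    points = tuple(n1[p] for p in H1["points"])
    return {"nodes": nodes, "edges": edges, "points": points}, n1, n2


# Two one-edge graphs glued at a single shared node.
G = {"nodes": {"g"}, "edges": {}, "points": ()}
H1 = {"nodes": {"a", "b"}, "edges": {"e": ("x", ("a", "b"))}, "points": ()}
H2 = {"nodes": {"c", "d"}, "edges": {"f": ("y", ("c", "d"))}, "points": ()}
H, n1, n2 = pushout(G, H1, H2, {"g": "b"}, {"g": "c"})
```

In the small example, `b` and `c` are identified, so the result is a path of two edges over three nodes.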

In order to ensure the existence and uniqueness of pushout complements (i.e., the existence and uniqueness of $m_2$ and $n_2$ if $m_1$ and $n_1$ are given), additional conditions must be satisfied. Below, we only need the case where both of the given morphisms are injective. In this case it is sufficient to assume that the dangling condition is satisfied. Two morphisms $m_1\colon G \to H_1$ and $n_1\colon H_1 \to H$ satisfy the dangling condition if no edge $e \in E_H \setminus n_1(E_{H_1})$ is attached to a node in $n_1(V_{H_1}) \setminus n_1(m_1(V_G))$. It is well-known that, if $m_1$ and $n_1$ are injective, then there are $m_2$ and $n_2$ such that $(m_1, m_2, n_1, n_2)$ is a pushout, if and only if $m_1$ and $n_1$ satisfy the dangling condition. Furthermore, if they exist, then $m_2$ and $n_2$ are uniquely determined (up to isomorphism). 
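
As a rough illustration, the dangling condition can be checked directly from its definition. The sketch below assumes a hypothetical encoding (not taken from the paper): each graph is a dictionary with a set `"nodes"` and a mapping `"edges"` from edge names to pairs of label and attachment tuple, and each morphism is a dictionary on atoms. The function name is likewise hypothetical.

```python
def satisfies_dangling_condition(G, H1, H, m1, n1):
    """Check the dangling condition for m1: G -> H1 and n1: H1 -> H.

    It holds if no edge of H outside n1(E_H1) is attached to a node
    that lies in n1(V_H1) but not in n1(m1(V_G)).
    """
    image_edges = {n1[e] for e in H1["edges"]}
    image_nodes = {n1[v] for v in H1["nodes"]}
    kept_nodes = {n1[m1[v]] for v in G["nodes"]}
    critical_nodes = image_nodes - kept_nodes
    for e, (_label, att) in H["edges"].items():
        if e not in image_edges and any(v in critical_nodes for v in att):
            return False
    return True


# G has one node x; H1 adds a node y; n1 embeds H1 into a larger graph.
G = {"nodes": {"x"}, "edges": {}}
H1 = {"nodes": {"x", "y"}, "edges": {}}
m1 = {"x": "x"}
n1 = {"x": "p", "y": "q"}
H_ok = {"nodes": {"p", "q", "r"}, "edges": {"k": ("a", ("p", "r"))}}
H_bad = {"nodes": {"p", "q", "r"}, "edges": {"k": ("a", ("q", "r"))}}
```

Here `H_bad` violates the condition because the edge `k`, which is not in the image of `H1`, is attached to `q`, the image of a node of `H1` that does not come from `G`.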

A transformation rule (rule, for short) is a pair $t\colon L \xleftarrow{\,l\,} I \xrightarrow{\,r\,} R$ of morphisms $l\colon I \to L$ and $r\colon I \to R$ such that $l$ is injective. $L$, $I$, and $R$ are the left-hand side, interface, and right-hand side of $t$. A graph $G$ can be transformed into a graph $H$ by an application of $t$, denoted by $G \Rightarrow_t H$, if there is an injective morphism $o\colon L \to G$, called an occurrence morphism, such that two pushouts 




exist. It follows from the facts about pushouts and pushout complements recalled above that such a diagram exists if and only if $l$ and $o$ satisfy the dangling condition, and in this case $H$ is uniquely determined up to isomorphism. Notice that we only consider injective occurrence morphisms, which is done in order to avoid additional difficulties when considering the hierarchical case. On the other hand, the morphism $r$ of a rule $t\colon L \xleftarrow{\,l\,} I \xrightarrow{\,r\,} R$ is allowed to be non-injective. 
3 Hierarchical Graphs 

Graphs as defined in the previous section are flat. If someone wished to implement, say, some complicated abstract data type by means of graph transformation, there would be no structuring mechanisms available, except for the possibilities the graphs themselves provide. Thus, any structural information would have to be coded into the graphs, a solution which is usually inappropriate and error-prone. To overcome this limitation, we introduce graphs with an arbitrarily deep hierarchical structure. This is achieved by means of special edges, called frames, which may contain hierarchical graphs again. In fact, it turns out to be useful to be even more general by allowing some frames to contain variables instead of graphs. These structures will be called hierarchical graphs. 

Let $X$ be a set of symbols called variables. The class $\mathcal{H}(X)$ of hierarchical graphs with variables in $X$ consists of triples $H = (G, F, cts)$ such that $G$ is a graph (the root of the hierarchy), $F \subseteq E_G$ is the set of frame edges (or just frames), and $cts\colon F \to \mathcal{H}(X) \cup X$ assigns to each frame $f \in F$ its contents $cts(f) \in \mathcal{H}(X) \cup X$. Formally, $\mathcal{H}(X)$ is defined inductively over the depth of frame nesting, as follows. A triple $H = (G, F, cts)$ as above is in $\mathcal{H}_0(X)$ if $F = \emptyset$. In this case, $H$ may be identified with the graph $G$. For $i > 0$, $H \in \mathcal{H}_i(X)$ if $cts(f) \in \mathcal{H}_{i-1}(X) \cup X$ for every frame $f \in F$. Finally, $\mathcal{H}(X)$ denotes the union of all these classes: $\mathcal{H}(X) = \bigcup_{i \geq 0} \mathcal{H}_i(X)$. (Notice that $\mathcal{H}_i(X) \subseteq \mathcal{H}_{i+1}(X)$ for all $i \geq 0$. We have $\mathcal{H}_0(X) \subseteq \mathcal{H}_1(X)$ because an empty set of frames trivially satisfies the requirement; using this, $\mathcal{H}_i(X) \subseteq \mathcal{H}_{i+1}(X)$ follows by an obvious induction on $i \geq 0$.) The sets $\mathcal{H}(\emptyset)$ and $\mathcal{H}_i(\emptyset)$ ($i \geq 0$) are briefly denoted by $\mathcal{H}$ and $\mathcal{H}_i$, respectively. These variable-free hierarchical graphs are those in which we are mainly interested. 

Notice that, to avoid unnecessary restrictions, the definition of a hierarchical graph $H = (G, F, cts)$ does not impose any relation between the nodes and edges of $G$ and those of $cts(f)$, $f \in F$. Restrictions of this kind may be added for specific application areas, but the results of this paper hold in general. 

Example 1 (Queue graphs). As a running example, we show how queues and their typical operations can be implemented using hierarchical graph transformation. Two kinds of frames are used to represent queues as hierarchical graphs: Unary item frames contain the graphs stored in the queue; binary queue frames contain a queue graph, which is a chain of edges connecting their begin point to their end point, every node in between carrying an item frame. 

Figure 1 shows two queue frames. Nodes are drawn as circles, and filled if they are points. Edges are drawn as boxes, and connected to their attachments by lines that are ordered counter-clockwise, starting at noon. Frames have double lines, and their contents are drawn inside. Plain binary edges are drawn as arrows from their first to their second attachment (as in simple graphs). In our examples, their labels do not matter, and are omitted. (In the item graphs, the arrowheads are omitted too.) Frame labels are not drawn either, as queue and item frames can be distinguished by their arity. 

Fig. 1. Two queue frames representing (a) an empty queue (b) a queue of length 3 

Note that item frames may contain graphs of any arity; in Figure 1 (b), they have 1, 2, and no points, respectively. 

Unless they are explicitly named, the three components of a hierarchical graph $H$ are denoted by $\underline{H}$, $F_H$, and $cts_H$, respectively. The notations $V_H$, $E_H$, $att_H$, $lab_H$, $p_H$, and $A_H$ are used as abbreviations denoting $V_{\underline{H}}$, $E_{\underline{H}}$, $att_{\underline{H}}$, $lab_{\underline{H}}$, $p_{\underline{H}}$, and $A_{\underline{H}}$, respectively. Furthermore, we denote by $X_H$ the set $\{f \in F_H \mid cts_H(f) \in X\}$ of variable frames of $H$ and by 

$$var(H) = cts_H(X_H) \cup \bigcup_{f \in F_H \setminus X_H} var(cts_H(f))$$ 

the set of variables occurring in $H$. 
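
The inductive definition of hierarchical graphs and the function $var$ can be mirrored by a small recursive data structure. The sketch below is an assumption-laden illustration: the class `HGraph`, its field names, and the helper `depth` (the least $i$ such that the graph lies in $\mathcal{H}_i(X)$) are hypothetical names introduced here, variables are represented as Python strings, and the root graph is reduced to its sets of nodes and edges.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Union


@dataclass
class HGraph:
    """A hierarchical graph (root, F, cts) with variables as strings."""
    nodes: Set[str]
    edges: Set[str]
    # cts: frame edge -> contents (a nested HGraph or a variable name);
    # the set F of frames is the key set of this mapping.
    contents: Dict[str, Union["HGraph", str]] = field(default_factory=dict)

    def __post_init__(self):
        # Frames must be edges of the root graph.
        if not set(self.contents) <= self.edges:
            raise ValueError("every frame must be an edge of the root graph")


def variables(h):
    """var(H): variables in variable frames, plus those nested deeper."""
    result = set()
    for cts in h.contents.values():
        if isinstance(cts, str):
            result.add(cts)
        else:
            result |= variables(cts)
    return result


def depth(h):
    """Least i such that h is in H_i(X); variable frames add no depth."""
    if not h.contents:
        return 0
    nested = [depth(c) for c in h.contents.values() if isinstance(c, HGraph)]
    return 1 + max(nested, default=0)


# A queue frame holding one item frame with a flat graph and one variable.
item = HGraph(nodes={"n"}, edges=set())
queue = HGraph(nodes={"b", "m", "e"}, edges={"i1", "i2", "c1", "c2"},
               contents={"i1": item, "i2": "x"})
top = HGraph(nodes={"u", "v"}, edges={"q"}, contents={"q": queue})
```

For this example `variables(top)` is `{"x"}` and `depth(top)` is 2, since the item frame sits inside the queue frame.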

Let $G$ and $H$ be hierarchical graphs such that $A_G \cap A_H = \emptyset$. The disjoint union of $G$ and $H$ is denoted by $G + H$ and yields the hierarchical graph $K$ such that $\underline{K} = \underline{G} + \underline{H}$, $F_K = F_G \cup F_H$, and $cts_K(f)$ equals $cts_G(f)$ if $f \in F_G$ and $cts_H(f)$ if $f \in F_H$. For a hierarchical graph $G$ and a set $S = \{H_1, \ldots, H_n\}$ of hierarchical graphs, we denote $G + H_1 + \cdots + H_n$ by $G + \sum_{H \in S} H$. (Notice that, although the disjoint union of hierarchical graphs does not commute, this is well defined as it does not depend on the order of $H_1, \ldots, H_n$.) 

We will now generalize the concept of morphisms to the hierarchical case. 
The definition is quite straightforward. Such a hierarchical morphism h: G H 
consists of an ordinary morphism on the topmost level and, recursively, hierar- 
chical morphisms from the contents of non-variable frames to the contents of 
their images. Naturally, only variable frames can be mapped to variable frames, 
but they can also be mapped to any other frame carrying the right label. 

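Formally, let $G, H \in \mathcal{H}(X)$. A hierarchical morphism $h\colon G \to H$ is a pair 
$h = (\underline{h}, (h^f)_{f \in F_G \setminus X_G})$ where 

— $\underline{h}\colon \underline{G} \to \underline{H}$ is a morphism, 

— $\underline{h}(f) \in F_H$ for all frames $f \in F_G$, where $\underline{h}(f) \in X_H$ implies $f \in X_G$, and 

— $h^f\colon cts_G(f) \to cts_H(\underline{h}(f))$ is a hierarchical morphism for every $f \in F_G \setminus X_G$. 

For atoms $a \in A_G$, we usually write $h(a)$ instead of $\underline{h}(a)$. Furthermore, a hier- 
archical morphism $h\colon G \to H$ for which $G, H \in \mathcal{H}_0$ is identified with $\underline{h}$. 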

The composition $h \circ g$ of hierarchical morphisms $g\colon G \to H$ and $h\colon H \to L$ 
is defined in the obvious way. It yields the hierarchical morphism $l\colon G \to L$ such 
that $\underline{l} = \underline{h} \circ \underline{g}$ and, for all frames $f \in F_G \setminus X_G$, $l^f = h^{g(f)} \circ g^f$. The hierarchical 
morphism $g$ is injective if $\underline{g}$ is injective and, for all $f \in F_G \setminus X_G$, $g^f$ is injective. 
It is surjective up to variables if $\underline{g}$ is surjective and, for all $f \in F_G \setminus X_G$, $g^f$ is 
surjective up to variables. Finally, $g$ is bijective up to variables if it is surjective up 
to variables and injective. If $G$ does not contain variables, we speak of surjective 
and bijective hierarchical morphisms. A bijective hierarchical morphism is also 
called an isomorphism, and $G, H \in \mathcal{H}$ are said to be isomorphic, $G \cong H$, if there 
is an isomorphism $m\colon G \to H$. 

Let $\mathcal{H}$ be the category whose objects are variable-free hierarchical graphs 
and whose morphisms are the hierarchical morphisms $h\colon G \to H$ with $G, H \in \mathcal{H}$ 
(which is indeed a category, as one can easily verify). The main result we are 
going to establish in order to obtain a notion of hierarchical graph transformation 
is that $\mathcal{H}$ has pushouts. For this, looking at the inductive definition of hierarchical 
graphs and their morphisms, it is a rather obvious idea to proceed by induction 
on the depth of the frame nesting. The induction basis is then provided by 
the non-hierarchical case recalled in Section 2. In order to use the induction 
hypothesis, we have to reduce the depth of a hierarchical graph in some way. This 
can be done on the basis of a rather simple construction. Given a hierarchical 
graph $H \in \mathcal{H}_i$, we take the contents of its frames out of these frames (which, 
thereby, become ordinary edges) and add them disjointly to $H$, thus obtaining 
a hierarchical graph in $\mathcal{H}_{i-1}$ (provided that $i > 0$). Denoting this mapping by 
$\varphi$, we get the desired theorem, which is the main result of this section. It states 
that the category $\mathcal{H}$ has pushouts, and the proof shows how to construct them 
effectively. 
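
To make the depth-reducing construction concrete, the following sketch performs one unnesting step on a variable-free hierarchical graph. It is an illustration under stated assumptions rather than the construction used in the proof: the class `HGraph` and the functions `depth` and `unnest` are hypothetical names, disjointness is obtained by prefixing atom names with the frame name, labels are omitted, and the points of the extracted contents are simply dropped, a detail the text above does not settle.

```python
from dataclasses import dataclass, field


@dataclass
class HGraph:
    """Hypothetical encoding of a variable-free hierarchical graph."""
    nodes: set
    edges: dict                                  # edge -> tuple of attached nodes
    points: tuple = ()
    frames: dict = field(default_factory=dict)   # frame -> HGraph (its contents)


def depth(h):
    """Depth of the frame nesting; 0 for a graph without frames."""
    return 1 + max(depth(c) for c in h.frames.values()) if h.frames else 0


def unnest(h):
    """Turn the top-level frames into ordinary edges and add their
    contents disjointly to the top level (names prefixed by the frame)."""
    nodes, edges, frames = set(h.nodes), dict(h.edges), {}
    for f, contents in h.frames.items():
        nodes |= {f"{f}/{v}" for v in contents.nodes}
        for e, att in contents.edges.items():
            edges[f"{f}/{e}"] = tuple(f"{f}/{v}" for v in att)
        for g, inner in contents.frames.items():   # frames of the contents stay frames
            frames[f"{f}/{g}"] = inner
    return HGraph(nodes, edges, h.points, frames)


item = HGraph(nodes={"c"}, edges={})
queue = HGraph(nodes={"b"}, edges={"i": ("b",)}, points=("b",), frames={"i": item})
top = HGraph(nodes={"a"}, edges={"q": ("a",)}, frames={"q": queue})
once = unnest(top)
twice = unnest(once)
```

In this example the depth drops from 2 to 1 to 0, which is the property the induction relies on.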



Theorem 1. For every pair $m_1\colon G \to H_1$ and $m_2\colon G \to H_2$ of morphisms in $\mathcal{H}$ 
there are morphisms $n_1\colon H_1 \to H$ and $n_2\colon H_2 \to H$ in $\mathcal{H}$ (for some hierarchical 
graph $H$) such that $(m_1, m_2, n_1, n_2)$ is a pushout. Furthermore, $(\underline{m_1}, \underline{m_2}, \underline{n_1}, \underline{n_2})$ 
is a pushout in the category of graphs. 



Proof sketch. The proof works by induction on $i$, where $H_1, H_2 \in \mathcal{H}_i$. The case 
$i = 0$ is the non-hierarchical one, and it is easy to see that every pushout in the 
category of non-hierarchical graphs and morphisms is a pushout in $\mathcal{H}$ as well. 
Thus, let $i > 0$. Extending $\varphi$ to morphisms in the canonical way, one obtains 
$\varphi(m_1) = (m_1'\colon G' \to H_1')$ and $\varphi(m_2) = (m_2'\colon G' \to H_2')$ where $H_1', H_2' \in \mathcal{H}_{i-1}$. 
By the induction hypothesis, this yields a pushout $(m_1', m_2', n_1', n_2')$ for some 
$n_j'\colon H_j' \to H'$ ($j \in \{1,2\}$). Now, it can be shown that $n_j' = \varphi(n_j)$ for hierar- 
chical morphisms $n_j\colon H_j \to H$, yielding a commuting square $(m_1, m_2, n_1, n_2)$. 
Intuitively, the parts of $H'$ which stem from the contents of a frame $f$ in $H_j$ can 
be stored in $n_j'(f)$, turning this edge into a frame of the hierarchical graph $H$ 
constructed. The main part of the proof is to show that $H$ and the hierarchical 
morphisms $n_j$ obtained in this way are well defined. 

Finally, one has to verify the universal pushout property of $(m_1, m_2, n_1, n_2)$. 
Let $l_1\colon H_1 \to L$ and $l_2\colon H_2 \to L$ be such that $(m_1, m_2, l_1, l_2)$ commutes and let 
$\varphi(l_j) = (l_j'\colon H_j' \to L')$ for $j \in \{1,2\}$. Then $(m_1', m_2', l_1', l_2')$ commutes as well. 
Therefore, the pushout property of $(m_1', m_2', n_1', n_2')$ yields a unique morphism 
$l'\colon H' \to L'$ such that $l_j' = l' \circ n_j'$. Again, $l'$ can be turned into $l\colon H \to L$ with 
$l' = \varphi(l)$ and $l_j = l \circ n_j$ for $j \in \{1,2\}$. Furthermore, for $k\colon H \to L$ with $k \neq l$ 
we have $\varphi(k) \neq \varphi(l)$, which shows that $l$ is unique, by the uniqueness of $l'$. □ 
Notice that the proof of Theorem 1 yields a recursive procedure to construct 
pushouts in $\mathcal{H}$, based on the construction of pushouts in the case of ordinary 
graph morphisms. 

The construction in the proof of the theorem yields a corollary for the special 
case where $m_1$ and $m_2$ are injective. Obviously, in this case the hierarchical 
morphisms $m_1'$ and $m_2'$ in the proof are also injective. As a consequence, it 
follows that $(m_1^f, m_2^f, n_1^{m_1(f)}, n_2^{m_2(f)})$ is a pushout for every frame $f \in F_G$. This 
yields the following specialization of Theorem 1. 

Corollary 1. Let $m_1\colon G \to H_1$ and $m_2\colon G \to H_2$ be injective hierarchical mor- 
phisms in $\mathcal{H}$. Then, one can construct hierarchical morphisms $n_1\colon H_1 \to H$ and 
$n_2\colon H_2 \to H$ such that $(m_1, m_2, n_1, n_2)$ is a pushout, as follows: 

— $\underline{n_1}$ and $\underline{n_2}$ are such that $(\underline{m_1}, \underline{m_2}, \underline{n_1}, \underline{n_2})$ is a pushout, 

— for every frame $f \in F_G$, $n_1^{m_1(f)}$ and $n_2^{m_2(f)}$ are constructed recursively in 
such a way that $(m_1^f, m_2^f, n_1^{m_1(f)}, n_2^{m_2(f)})$ is a pushout, and 

— for every frame $f \in F_{H_i} \setminus m_i(F_G)$ ($i \in \{1,2\}$), $n_i^f$ is an isomorphism. 

Next, we shall see how pushout complements can be obtained. For simplicity, 
we consider only the case where the two given hierarchical morphisms are both 
injective. This enables us to make use of Corollary 1 in an easy way, whereas 
the more general case would be unreasonably complicated as it would require a 
hierarchical version of the so-called identification condition. 

Clearly, in order to ensure the existence of pushout complements, a hier- 
archical version of the dangling condition must be satisfied. However, for the 
hierarchical case it must also be required that, intuitively, no frame is deleted 
unless its contents is deleted as well. Let $L \in \mathcal{H}(X)$ and $I, G \in \mathcal{H}$ (right be- 
low, we shall only use the following definition for $L \in \mathcal{H}$, but later on the more 
general case $L \in \mathcal{H}(X)$ will turn out to be valuable, too). Two hierarchical 
morphisms $m\colon I \to L$ and $n\colon L \to G$ satisfy the hierarchical dangling condition 
(dangling condition, for short) if 

— $\underline{m}$ and $\underline{n}$ satisfy the (non-hierarchical) dangling condition, 

— for every frame $f \in F_L \setminus (m(F_I) \cup X_L)$, $n^f$ is bijective up to variables, and 

— for every frame $f \in F_I \setminus X_I$, $m^f$ and $n^{m(f)}$ satisfy the dangling condition. 

Notice that this condition coincides with the usual one in the special case 
where $m$ and $n$ are ordinary graph morphisms, because in this case only the first 
requirement is relevant as there are no frames. Intuitively, the second part of 
the condition states that, as mentioned above, a frame can be deleted only if its 
contents is deleted as well (at least in the case where $L \in \mathcal{H}$; the more general 
case is not yet our concern). As the proof below shows, this corresponds to the 
last item in Corollary 1 (and is thus indeed necessary). 

Theorem 2. Let $m_1\colon G \to H_1$ and $n_1\colon H_1 \to H$ be injective hierarchical 
morphisms in $\mathcal{H}$. Then there are hierarchical morphisms $m_2\colon G \to H_2$ and 
$n_2\colon H_2 \to H$ such that $(m_1, m_2, n_1, n_2)$ is a pushout, if and only if $m_1$ and $n_1$ 
satisfy the dangling condition. In this case $m_2$ and $n_2$ are uniquely determined. 
Proof. Let $G \in \mathcal{H}_i$. Again, we proceed by induction on $i$. Clearly, if $m_2$ and $n_2$ 
exist, then $m_2$ must be injective since $n_1 \circ m_1 = n_2 \circ m_2$ is injective. By Corollary 1 
this means that $m_2$ and $n_2$ exist if and only if they can be constructed in such 
a way that the following are satisfied: 

(1) $\underline{m_2}$ and $\underline{n_2}$ are such that $(\underline{m_1}, \underline{m_2}, \underline{n_1}, \underline{n_2})$ is a pushout, 

(2) for every frame $f \in F_G$ the hierarchical morphisms $m_2^f$ and $n_2^{m_2(f)}$ are 
constructed recursively, so that $(m_1^f, m_2^f, n_1^{m_1(f)}, n_2^{m_2(f)})$ is a pushout, and 

(3) for every frame $f \in F_{H_i} \setminus m_i(F_G)$ ($i \in \{1,2\}$), $n_i^f$ is an isomorphism. 

As $m_1$ and $n_1$ satisfy the dangling condition, $\underline{m_2}$ and $\underline{n_2}$ exist and are uniquely 
determined (since $\underline{m_1}$ and $\underline{n_1}$ satisfy the dangling condition for non-hierarchical 
morphisms), and (3) is satisfied for $i = 1$ (because of the second part of the 
dangling condition). Furthermore, the induction hypothesis yields the required 
hierarchical morphisms $m_2^f$ and $n_2^{m_2(f)}$ satisfying (2), for every frame $f \in F_G$. 
Together with the remaining requirement in (3) (i.e., the case where $i = 2$) this 
determines $m_2$ and $n_2$ up to isomorphism, thus finishing the proof. □ 

4 Hierarchical Graph Transformation 

Based on the results presented in the previous section we are now able to define 
rules and their application in the style of the double-pushout approach. From 
now on, a rule $t\colon L \xleftarrow{l} I \xrightarrow{r} R$ is assumed to consist of two hierarchical morphisms 
$l\colon I \to L$ and $r\colon I \to R$, where $L, I, R \in \mathcal{H}$ and $l$ is injective. The hierarchical 
graphs $L$, $I$, and $R$ are called the left-hand side, interface, and right-hand side. 

The application of rules is defined by means of the usual double-pushout 
construction, with one essential difference. In order to make sure that transfor- 
mations can take place on an arbitrary level in the hierarchy of frames (rather 
than only on top level) one has to employ recursion. 

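Definition 1 (Transformation of hierarchical graphs). Let $t\colon L \xleftarrow{l} I \xrightarrow{r} R$ 
be a rule. A hierarchical graph $G \in \mathcal{H}$ is transformed into a hierarchical graph 
$H \in \mathcal{H}$ by means of $t$, denoted by $G \Rightarrow_t H$, if one of the following holds: 

(1) There is an injective hierarchical morphism $o\colon L \to G$, called an occurrence 
morphism, such that there are two pushouts 

$$\begin{array}{ccccc}
L & \xleftarrow{\;l\;} & I & \xrightarrow{\;r\;} & R \\
{\scriptstyle o}\downarrow & & \downarrow & & \downarrow \\
G & \longleftarrow & K & \longrightarrow & H
\end{array}$$

in $\mathcal{H}$, or 

(2) $\underline{H} \cong \underline{G}$ via some isomorphism $m\colon \underline{G} \to \underline{H}$, and there is a frame $f \in F_G$ 
such that $cts_G(f) \Rightarrow_t cts_H(m(f))$ and $cts_H(m(f')) = cts_G(f')$ for all 
$f' \in F_G \setminus \{f\}$. 

For a set $T$ of rules, we write $G \Rightarrow_T H$ if $G \Rightarrow_t H$ for some $t \in T$. 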
Fig. 2. The concatenation rule and its application 



Example 2 (Concatenation of queues). In Figure 2 we show a concatenation rule 
for queues that identifies two queue frames and concatenates their contents, and 
a transformation with this rule. The digits in the rule indicate how the nodes of 
the graphs have to be mapped onto each other. 

It should be noticed that the definition of transformation steps requires oc- 
currence morphisms to be injective. Therefore, we need three variants of this 
rule where node 1 is identified with node 2, or 7 with 8, or both 1 with 2 and 7 
with 8. (Similar variants are needed for the rules in the subsequent examples.) 

Since occurrence morphisms are injective, we get the following theorem as a 
consequence of Theorems 1 and 2. 

Theorem 3. Let $t\colon L \xleftarrow{l} I \xrightarrow{r} R$ be a rule, $G \in \mathcal{H}$, and $o\colon L \to G$ an occurrence 
morphism. Then the two pushouts in Definition 1(1) exist if and only if $o$ satisfies 
the dangling condition.¹ Furthermore, in this case the pushouts are uniquely 
determined up to isomorphism. 

Proof. By Theorem 2 the pushout on the left exists if and only if the dangling 
condition is satisfied, and if it exists then it is uniquely determined up to iso- 
morphism. Finally, by Theorem 1 the pushout on the right always exists, and it 
is a general fact known from category theory that a pushout $(m_1, m_2, n_1, n_2)$ is 
uniquely determined (up to isomorphism) by the morphisms $m_1$ and $m_2$. □ 

The reader should also notice that, as a consequence of the effectiveness of 
the results presented in Section 3, given a transformation rule, a hierarchical 
graph, and an occurrence morphism satisfying the dangling condition, one can 
effectively construct the required pushouts. 

¹ If the rule $t\colon L \xleftarrow{l} I \xrightarrow{r} R$ in question is clear we say that $o$ satisfies the dangling 
condition if $l$ and $o$ do. 
Unfortunately, the notion of transformation of hierarchical graphs is not yet 
expressive enough to be satisfactory for certain programming purposes. There 
are some natural effects that one would certainly like to be able to implement as 
single transformation steps, but which cannot be expressed by rules. Consider 
the example of queues, for instance. It should be possible to design a rule dequeue 
which removes the first item in a queue, regardless of its contents. However, this 
is not possible as the dangling condition requires the occurrence morphism to 
be bijective on the contents of deleted frames. Conversely, another rule enqueue 
should take an item frame, again regardless of its contents, and add it to the 
queue — preferably without affecting the original item frame. In order to imple- 
ment this, one has to circumvent two obstacles. First, hierarchical morphisms 
preserve the frame hierarchy, which implies that, intuitively, rules cannot move 
frames across frame boundaries. Second, by now it is simply not possible to 
duplicate frames together with their contents. 

This is where variables start to play an important role. The idea is to turn 
from rules to rule schemata and to transform hierarchical graphs by applying 
instances of these rule schemata. In order to make sure that an occurrence mor- 
phism satisfying the dangling condition always yields a well-defined transforma- 
tion, we restrict ourselves to left-linear rule schemata. For this, a hierarchical 
graph H is called linear if no variable occurs twice in H . 

A variable instantiation for $H \in \mathcal{H}(X)$ is a mapping $\iota\colon var(H) \to \mathcal{H}$. The 
application of $\iota$ to $H$ is denoted by $H\iota$. It turns every variable frame $f \in X_H$ into 
a frame whose contents is $\iota(cts_H(f))$. By the definition of hierarchical morphisms, 
for every hierarchical morphism $h\colon G \to H$ such that $G \in \mathcal{H}$ and every variable 
instantiation $\iota$ for $H$, $h$ can as well be understood as a hierarchical morphism 
from $G$ to $H\iota$. In the following, this hierarchical morphism will be denoted by $h\iota$. 
Based on this observation, rule schemata and their application can be defined. 
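
Linearity and the application of a variable instantiation are both simple recursions over the frame hierarchy. The sketch below illustrates them under assumptions that are not taken from the paper: `HGraph`, `is_linear`, and `instantiate` are hypothetical names, a variable frame stores its variable name as a string, and an instantiation is a plain dictionary from variable names to variable-free hierarchical graphs.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class HGraph:
    """Hypothetical encoding of a hierarchical graph (labels omitted)."""
    nodes: set
    edges: dict                                  # edge -> tuple of attached nodes
    points: tuple = ()
    frames: dict = field(default_factory=dict)   # frame -> HGraph, or a variable name (str)


def variable_occurrences(h):
    """Count how often each variable occurs anywhere in h."""
    counts = Counter()
    for contents in h.frames.values():
        if isinstance(contents, str):
            counts[contents] += 1
        else:
            counts += variable_occurrences(contents)
    return counts


def is_linear(h):
    """h is linear if no variable occurs twice in h."""
    return all(n == 1 for n in variable_occurrences(h).values())


def instantiate(h, iota):
    """Apply the instantiation iota: every variable frame gets iota(x) as contents."""
    frames = {f: (iota[c] if isinstance(c, str) else instantiate(c, iota))
              for f, c in h.frames.items()}
    return HGraph(set(h.nodes), dict(h.edges), h.points, frames)


g1 = HGraph(nodes={"n"}, edges={})
g2 = HGraph(nodes={"m"}, edges={})
schema = HGraph(nodes={"a"}, edges={"i": ("a",), "j": ("a",)}, frames={"i": "x", "j": "y"})
copying = HGraph(nodes={"a"}, edges={"i": ("a",), "j": ("a",)}, frames={"i": "x", "j": "x"})
result = instantiate(schema, {"x": g1, "y": g2})
```

Here `schema` is linear while `copying` is not, since `x` occurs twice; after instantiation no variable frames remain in `result`.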

Definition 2 (Transformation by rule schemata). A rule schema, denoted 
by $t\colon L \xleftarrow{l} I \xrightarrow{r} R$, is a pair consisting of hierarchical morphisms $l\colon I \to L$ and 
$r\colon I \to R$, where $L, R \in \mathcal{H}(X)$, $I \in \mathcal{H}$, $L$ is linear, and $var(R) \subseteq var(L)$. If $\iota$ is 
a variable instantiation for $L$ then the rule $t'\colon L\iota \xleftarrow{l\iota} I \xrightarrow{r\iota} R\iota$ is an instance of $t$. 

A rule schema $t$ transforms $G \in \mathcal{H}$ into $H \in \mathcal{H}$, denoted by $G \Rightarrow_t H$, 
if $G \Rightarrow_{t'} H$ for some instance $t'$ of $t$. For a set $T$ of rule schemata we write 
$G \Rightarrow_T H$ if $G \Rightarrow_t H$ for some $t \in T$. 



Example 3 (The rule schemata enqueue and dequeue). In Figure 3 we show 
a rule schema that inserts a framed item graph at the tail of a queue graph, 
and a transformation with that rule. The item frame contains the variable $x$. 
Otherwise, it would not be possible to duplicate the item graph, and to move it 
into the queue frame. 

In Figure 4 we show a rule schema that removes the first item frame in a 
queue graph. The item graph is denoted by the variable $x$ so that it can be 
removed entirely. 
Fig. 3. The rule schema enqueue and its application 




Fig. 4. The rule schema dequeue 



For practical purposes Definition 2 is not very convenient because there are 
infinitely many instances of a rule schema as soon as it contains at least one 
variable. Therefore, the naive approach to implement $\Rightarrow_t$ by constructing all 
its instances and then testing each of them for applicability does not work. 
However, there is quite an obvious way to do better than that. Consider 
some linear hierarchical graph $L \in \mathcal{H}(X)$ and a hierarchical graph $G \in \mathcal{H}$, and 
let $o\colon L \to G$ be a hierarchical morphism. Then, due to the linearity of $L$, $o$ 
induces a variable instantiation $\iota_o\colon var(L) \to \mathcal{H}$ and an occurrence morphism 
$inst(o)\colon L\iota_o \to G$, as follows. For all $x \in var(L)$, if there is some $f \in X_L$ such 
that $cts_L(f) = x$ then $\iota_o(x) = cts_G(o(f))$. Otherwise, $\iota_o(x) = \iota_{o^f}(x)$, where 
$f \in F_L \setminus X_L$ is the unique frame such that $x \in var(cts_L(f))$. Furthermore, 
$\underline{inst(o)} = \underline{o}$ and for all $f \in F_L$, $inst(o)^f$ is the identity on $cts_G(o(f))$ if $f \in X_L$ 
and $inst(o)^f = inst(o^f)$ otherwise. 
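
The induced instantiation can be computed by walking $L$ and $G$ in parallel along the morphism. The following sketch models only the frame part of a hierarchical morphism and assumes, as in the text, that $L$ is linear; the names `HGraph`, `HMorphism`, and `induced_instantiation` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class HGraph:
    """Hypothetical encoding of a hierarchical graph (labels omitted)."""
    nodes: set
    edges: dict
    points: tuple = ()
    frames: dict = field(default_factory=dict)   # frame -> HGraph, or a variable name (str)


@dataclass
class HMorphism:
    """Frame part of a hierarchical morphism o: L -> G (node images omitted)."""
    frame_map: dict                           # frame of L -> frame of G
    sub: dict = field(default_factory=dict)   # non-variable frame f -> morphism o^f


def induced_instantiation(l, g, o):
    """Compute the instantiation induced by o, assuming that l is linear."""
    iota = {}
    for f, contents in l.frames.items():
        image = g.frames[o.frame_map[f]]      # cts_G(o(f))
        if isinstance(contents, str):         # f is a variable frame with contents x
            iota[contents] = image
        else:                                 # x occurs deeper: continue with o^f
            iota.update(induced_instantiation(contents, image, o.sub[f]))
    return iota


# L: a queue frame q containing an item frame i with variable contents x.
l_queue = HGraph(nodes={"v"}, edges={"i": ("v",)}, frames={"i": "x"})
l = HGraph(nodes={"a"}, edges={"q": ("a",)}, frames={"q": l_queue})
# G: a queue frame Q containing an item frame I with a concrete item graph.
item = HGraph(nodes={"n1", "n2"}, edges={"e": ("n1", "n2")})
g_queue = HGraph(nodes={"w"}, edges={"I": ("w",)}, frames={"I": item})
g = HGraph(nodes={"b"}, edges={"Q": ("b",)}, frames={"Q": g_queue})
o = HMorphism(frame_map={"q": "Q"}, sub={"q": HMorphism(frame_map={"i": "I"})})
iota = induced_instantiation(l, g, o)
```

Because $L$ is linear, each variable is assigned exactly once, so the dictionary updates never conflict.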

The theorem below states that the transformations given by a rule schema 
$t\colon L \xleftarrow{l} I \xrightarrow{r} R$ can be obtained by considering occurrence morphisms $o\colon L \to G$ 
that satisfy the dangling condition. 
Theorem 4. Let $t\colon L \xleftarrow{l} I \xrightarrow{r} R$ be a rule schema and $G \in \mathcal{H}$. 

1. If $o\colon L \to G$ is an occurrence morphism satisfying the dangling condition, 
then $inst(o)$ is an occurrence morphism for $L\iota_o$ satisfying the dangling con- 
dition. 

2. If $\iota\colon var(L) \to \mathcal{H}$ is a variable instantiation and $q\colon L\iota \to G$ is an occurrence 
morphism satisfying the dangling condition, then $\iota = \iota_o$ and $q = inst(o)$ 
(up to isomorphism) for some occurrence morphism $o\colon L \to G$ satisfying the 
dangling condition. 

The proof by induction on $i$, where $L \in \mathcal{H}_i(X)$, is rather straightforward and 
is therefore skipped in this short version. 

5 Flattening 

A natural operation on hierarchical graphs is the flattening operation which 
removes the hierarchy by recursively replacing every frame with its contents. 
For this, we use the well-known concept of hyperedge replacement (see [?]) in 
a slightly generalized form. Flattening is similar to (a recursive version of) the 
operation $\varphi$ considered in Section 3, but it removes all frames and identifies their 
attached nodes with the corresponding points of their contents. If the numbers 
of attached nodes and points differ, the additional nodes of the longer sequence 
are treated like ordinary nodes. In addition, flattening forgets about the points 
of its argument, so that the resulting graph is “unpointed” . 

It will be shown in this section that, under modest assumptions, hierarchical 
graph transformation is compatible with the flattening operation: A transforma- 
tion $G \Rightarrow_t H$ induces a corresponding transformation $G' \Rightarrow_{t'} H'$, where $G'$, $H'$, 
and $t'$ are the flattened versions of $G$, $H$, and $t$, respectively. 

In order to proceed, we first need to define hyperedge replacement on hierar- 
chical graphs. Let $H$ be a hierarchical graph and consider a mapping $\sigma\colon E \to \mathcal{H}$ 
such that $E \subseteq E_H$, called a hyperedge substitution for $H$. Hyperedge replacement 
yields the hierarchical graph $H[\sigma]$ obtained from $H + \sum_{e \in E} \sigma(e)$ by deleting the 
edges in $E$ and identifying, for all $e \in E$, the $i$th node of $att_H(e)$ with the $i$th 
point of $p_{\sigma(e)}$, for all $i$ such that both these nodes exist. 

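Finally, for all $H \in \mathcal{H}$, let $fl(H) = H[\sigma]$ where $\sigma\colon F_H \to \mathcal{H}$ is given induc- 
tively by $\sigma(f) = fl(cts_H(f))$ for all $f \in F_H$. Then, the flattening of $H$ yields the 
graph $flat(H) = \langle V_{fl(H)}, E_{fl(H)}, att_{fl(H)}, lab_{fl(H)}, \lambda \rangle$, where $\lambda$ denotes the empty 
sequence of points. For most of the considerations below, it is sufficient to study 
the mapping $fl$, which removes the hierarchy without forgetting points, instead 
of $flat$. 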
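
The flattening operation can be sketched as a recursive procedure with a small union-find structure for the node identifications. This is an illustrative reading of the definition, not the authors' implementation: `HGraph` and `flatten` are hypothetical names, labels are omitted, disjointness is obtained by prefixing atom names with the path of enclosing frames (so names are assumed not to contain `/`), and the function returns the nodes, edges, and points of $fl(H)$, from which $flat(H)$ is obtained by discarding the points.

```python
from dataclasses import dataclass, field


@dataclass
class HGraph:
    """Hypothetical encoding of a variable-free hierarchical graph."""
    nodes: set
    edges: dict                                  # edge -> tuple of attached nodes
    points: tuple = ()
    frames: dict = field(default_factory=dict)   # frame -> HGraph (its contents)


def flatten(h, prefix=""):
    """Return (nodes, edges, points) of fl(h); frames are replaced by their
    flattened contents and attached nodes are merged with matching points."""
    parent = {}                                  # union-find forest over node names

    def find(x):
        while parent.get(x, x) != x:
            x = parent[x]
        return x

    nodes = {prefix + v for v in h.nodes}
    edges = {prefix + e: tuple(prefix + v for v in att)
             for e, att in h.edges.items() if e not in h.frames}
    for f, contents in h.frames.items():
        sub_nodes, sub_edges, sub_points = flatten(contents, prefix + f + "/")
        nodes |= sub_nodes
        edges.update(sub_edges)
        # zip stops at the shorter sequence: only positions present in both are merged
        for attached, point in zip(h.edges[f], sub_points):
            a, p = find(prefix + attached), find(point)
            if a != p:
                parent[a] = p
    nodes = {find(v) for v in nodes}
    edges = {e: tuple(find(v) for v in att) for e, att in edges.items()}
    points = tuple(find(prefix + v) for v in h.points)
    return nodes, edges, points


inner = HGraph(nodes={"1", "2"}, edges={"e": ("1", "2")}, points=("1", "2"))
outer = HGraph(nodes={"a", "b"}, edges={"f": ("a", "b")}, frames={"f": inner})
nodes, edges, points = flatten(outer)

# A frame whose contents repeats a point: its two attached nodes get merged.
loop = HGraph(nodes={"1"}, edges={}, points=("1", "1"))
merged_nodes, _, _ = flatten(HGraph(nodes={"a", "b"}, edges={"f": ("a", "b")},
                                    frames={"f": loop}))
```

In the first example the frame `f` disappears and the edge `e` of its contents connects the two former attachment nodes. In the second example the distinct nodes `a` and `b` collapse into one, because both are identified with the same point.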

We can flatten morphisms as well. Consider a hierarchical morphism $h\colon G \to H$ 
with $G, H \in \mathcal{H}$ and let $\sigma = fl \circ cts_G$ and $\tau = fl \circ cts_H$. Then, $fl(h)$ is 
the morphism $m\colon fl(G) \to fl(H)$ defined inductively, as follows. For all 
$a \in A_{fl(G)}$, if $a \in A_G$ then $m(a) = h(a)$, and if $a \in A_{\sigma(f)}$ for some $f \in F_G$ then 
$m(a) = fl(h^f)(a)$. Furthermore, $flat(h) = (m'\colon flat(G) \to flat(H))$ is given by 
$m'(a) = m(a)$ for all $a \in A_{flat(G)}$. (Notice that, although the two cases in the 
definition of $m(a)$ above intersect, they are consistent with each other.) 
Above, it was mentioned that the main result of this section holds only 
under a certain assumption. The reason for this is that a morphism $flat(h)$ 
may be non-injective although $h\colon G \to H$ itself is injective. This is caused by 
the fact that building $fl(G)$ may identify some nodes in $V_G$ because they are 
incident with a frame whose contents has repetitions in its point sequence. If 
the attached nodes of the frame are distinct, hyperedge replacement identifies 
them (by identifying each with the same point of the contents). Thus, flattening 
may turn an occurrence morphism into a non-injective morphism, making it 
impossible to apply the corresponding flattened rule. In fact, the dual situation 
where there are identical attached nodes of a frame while the corresponding 
points of its contents are distinct, must also be avoided. The reason lies in the 
recursive part of the definition of $\Rightarrow_t$. If a rule is applied to the contents of some 
frame, but the replacement of the frame identifies two distinct points of the 
contents because the corresponding attached points of the frame are identical, 
the flattened rule cannot be applied either. 

For this, call a hierarchical graph $H \in \mathcal{H}$ identification consistent if every 
frame $f \in F_H$ satisfies the following: 

(1) For all $i, j \in [\min(|att_H(f)|, |p_{cts_H(f)}|)]$, $att_H(f)(i) = att_H(f)(j)$ if and only 
if $p_{cts_H(f)}(i) = p_{cts_H(f)}(j)$, and 

(2) $cts_H(f)$ is identification consistent. 
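
Identification consistency is directly checkable. The sketch below follows the two conditions of the definition, under the assumption of a hypothetical encoding (`HGraph` and `identification_consistent` are illustrative names, labels are omitted, and positions are counted from 0 rather than 1).

```python
from dataclasses import dataclass, field


@dataclass
class HGraph:
    """Hypothetical encoding of a variable-free hierarchical graph."""
    nodes: set
    edges: dict                                  # edge -> tuple of attached nodes
    points: tuple = ()
    frames: dict = field(default_factory=dict)   # frame -> HGraph (its contents)


def identification_consistent(h):
    """Check conditions (1) and (2) for every frame of h."""
    for f, contents in h.frames.items():
        att, pts = h.edges[f], contents.points
        k = min(len(att), len(pts))              # only positions present in both
        for i in range(k):
            for j in range(k):
                # (1): attached nodes coincide exactly when the points coincide
                if (att[i] == att[j]) != (pts[i] == pts[j]):
                    return False
        if not identification_consistent(contents):   # (2): recurse into the contents
            return False
    return True


good = HGraph({"a", "b"}, {"f": ("a", "b")},
              frames={"f": HGraph({"1", "2"}, {}, points=("1", "2"))})
repeated_point = HGraph({"a", "b"}, {"f": ("a", "b")},
                        frames={"f": HGraph({"1"}, {}, points=("1", "1"))})
repeated_node = HGraph({"a"}, {"f": ("a", "a")},
                       frames={"f": HGraph({"1", "2"}, {}, points=("1", "2"))})
```

The graphs `repeated_point` and `repeated_node` correspond to the two problematic situations described before the definition, and both fail the check.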

The reader ought to notice that identification consistency is preserved by 
the application of a rule $t\colon L \xleftarrow{l} I \xrightarrow{r} R$ if $R$ is identification consistent and $r$ is 
injective. Thus, if we restrict ourselves to systems with rules of this kind then 
all derivable hierarchical graphs are identification consistent (provided that the 
initial ones are). 

It is not very difficult to verify the following two lemmas. 

Lemma 1. For every injective hierarchical morphism $h\colon G \to H$ ($G, H \in \mathcal{H}$) 
such that $H$ is identification consistent, $fl(h)$ is injective. 

Lemma 2. If $(m_1, m_2, n_1, n_2)$ is a pushout in $\mathcal{H}$, then $(flat(m_1), flat(m_2), 
flat(n_1), flat(n_2))$ is a pushout as well. 

As a consequence, one obtains the main theorem of this section: If a rule can 
be applied to an identification consistent hierarchical graph, then the flattened 
rule can be applied to the flattened graph, with the expected result. 

Theorem 5. Let $t\colon L \xleftarrow{l} I \xrightarrow{r} R$ be a rule and let $t'\colon L' \xleftarrow{l'} I' \xrightarrow{r'} R'$ be the rule 
given by $l' = flat(l)$ and $r' = flat(r)$. For every transformation $G \Rightarrow_t H$ such 
that $G$ is identification consistent, there is a transformation $flat(G) \Rightarrow_{t'} flat(H)$. 

Proof sketch. Consider a transformation step $G \Rightarrow_t H$. Due to the definition of 
$\Rightarrow_t$ there are two cases to be distinguished. If there is a double-pushout dia- 
gram as in the first case of Definition 1, Lemmas 1 and 2 yield a corresponding 
“flattened” diagram. The second case to be considered is the recursive one, i.e., 
the transformation takes place inside a frame $f$. In this case it may be assumed 
inductively that the diagram corresponding to a transformation of the flattened 
contents of $f$ exists. Due to the assumed identification consistency the flattened 
contents of $f$ is injectively embedded in $flat(G)$. Therefore, the given diagram 
can be extended to a larger pushout diagram in the required way, retaining the 
injectivity of the occurrence morphism. □ 

It should be noticed that the flattening process implies a loss of crucial struc- 
tural information so that there is no chance to prove the converse of the theorem. 



6 Conclusion 

We conclude this paper by briefly mentioning some related work and possible 
directions for future research. 

Pratt [?] was probably the first to consider a concept of hierarchical graph 
transformation, where he used a certain kind of node replacement to define the 
semantics of programming languages. His graph concept was extended in [?] 
by allowing edges between subgraphs contained in different nodes, but without 
defining transformation. 

A different concept of graph nesting is given by the abstraction mechanisms 
of the (old) graph transformation system Agg [12] and the multi-level graph 
grammars of [?], providing flat graphs with several views which are related by 
a rigid layering and a partial inclusion ordering, respectively. 

An indirect nesting concept can be found in the framework of [?] and the new 
Agg system [7], where nesting is realized by labels and attributes, respectively. 

The idea of using variables to extend the double-pushout approach with 
non-local effects, like copying and removal of subgraphs, is also followed in the 
so-called substitution-based approach to graph transformation [?] (working on 
flat hypergraphs). 

One direction for future work on hierarchical graph transformation is to lift 
to the hierarchical setting the classical results of the double-pushout approach, 
like sequential and parallel commutativity, results on parallelism, concurrency 
and amalgamation, etc. Another important task is to combine hierarchical graph 
transformation in an orthogonal way with concepts for structuring and control- 
ling systems of rules. As mentioned in the introduction, several such concepts 
(mainly for flat graphs) have recently been proposed in the literature. 

A further topic of research is to develop hierarchical graph transformation 
towards object-oriented graph transformation, as outlined in [?]. There the idea 
is to restrict the visibility of frames so that only rules designated to some frame 
type may inspect or update the contents of frames of this type. Such frame types 
come close to “classes”, and the designated rules correspond to “methods”. In 
this way frames can be seen as objects of their types that can only be manipulated 
by invoking the methods of the class. 
Abstract. This paper develops a highly expressive semantic framework 
for program refinement that supports both temporal reasoning and rea- 
soning about the knowledge of a single agent. The framework generalizes 
a previously developed temporal refinement framework by amalgamat- 
ing it with a logic of quantified local propositions, a generalization of 
the logic of knowledge. The combined framework provides a formal set- 
ting for development of knowledge-based programs, and addresses two 
problems of existing theories of such programs: lack of compositionality 
and the fact that such programs often have only implementations of high 
computational complexity. Use of the framework is illustrated by a con- 
trol theoretic example concerning a robot operating with an imprecise 
position sensor. 



1 Introduction 

The knowledge-based approach to the design and analysis of distributed sys- 
tems, introduced by Halpern and Moses [8], involves the use of modal logics of 
knowledge. One of the key contributions of this approach is the notion of knowl- 
edge-based programs [5,4], which generalize standard programs by allowing the 
tests in conditional constructs to be formulas in the logic of knowledge. Such 
programs contain statements of the form “if you know that X then do A else 
B” . This provides a high level abstraction of distributed programs that allows 
for perspicuous descriptions of how an agent’s actions are related to its state of 
information (which, in a distributed system, is typically incomplete) about its 
environment. 

In its current state of development, the knowledge-based approach has a 
number of limitations, among them that: 

1. The formal methodology for developing and reasoning about knowledge- 
based programs is at present only weakly developed. 

J. Tiuryn (Ed.): FOSSACS 2000, LNCS 1784, pp. 114- TrM 2000. 

(c) Springer- Verlag Berlin Heidelberg 2000 



A Program Refinement Framework 115 



2. The existing semantics for knowledge-based programs is based on a par- 
ticular interpretation of knowledge that requires a complete description of 
the implementing program. This prevents the compositional development of 
program fragments. 

3. Knowledge-based programs often have only implementations of unacceptably 
high computational complexity. 

This paper is a step in the direction of the formulation of the knowledge-based 
approach that addresses these limitations. 

One of the starting points for our work is the observation that knowledge- 
based programs are in one respect more like specifications than like standard 
programs. They cannot be directly executed — instead, their meaning is defined 
by a relation of “implementation” between knowledge based programs and stan- 
dard programs: a given knowledge-based program may have no, one, or many 
concrete programs as its implementations. As a specification formalism, however, 
knowledge-based programs are unbalanced, abstracting only the tests performed 
by agents, but providing no abstraction mechanism for their actions [?]. 

Action abstraction is handled much better in refinement calculi [?], also 
known as “broad spectrum” languages. Such calculi view programs and speci- 
fications as having the same semantic type, and support a formal methodology 
for the development of programs that are “correct by design”, where one be- 
gins with a specification and transforms it to an implementation by means of a 
sequence of correctness-preserving refinement steps. The focus in this area has 
been on sequential programs and atemporal assertions, but recently some ap- 
proaches to refinement admitting the expressive power of temporal logics have 
been developed [?]. 

A first step in the direction of a refinement calculus suited to the knowledge- 
based development of programs was taken in van der Meyden and Moses [?], 
where it is shown how to develop a refinement approach capturing certain types 
of temporal reasoning that will be critical in knowledge-based program develop- 
ment. We further develop these ideas in the present paper, by showing how they 
may be extended to accommodate knowledge-based reasoning. Significantly, the 
framework we define admits compositional program development. 

In developing the extension, we also seek to address the final limitation of 
knowledge-based programs alluded to above. To implement the statement “if 
you know that X then do A else B”, a concrete program must do A exactly 
when it is in a local state (captured by the values of the variables and storage it 
maintains locally) that carries the information that X is true. The difficulty with 
this is that computing whether a local state bears the information that X may 
have very high computational complexity [12,15,18]. As argued by Sanders [?] 
and us [3], in practice, it may often be sufficient to use conditions on the agent’s 
state of information that are sound, but not complete, tests of its knowledge. 
Such tests may be expressed in the Logic of Local Propositions (LLP) [?]. 

The present paper integrates the temporal refinement framework of van der 
Meyden and Moses [16] with the logic of local propositions. Although our ultimate 
aim is a framework for the development of distributed systems, we deal 
in this paper with a single agent operating synchronously with its environment: 
asynchrony and multiple agents introduce complexities that we plan to address 
in the future. The main novelty is the introduction of a programming/speci- 
fication construct that resembles a quantification over local propositions. This 
construct makes it possible to write specifications stating that the agent condi- 
tions its behaviour on a local test for some property of interest, without stating 
explicitly what test is used. The introduction of this construct necessitates an 
adaptation of the semantics of the temporal refinement of [16]. 

The paper is structured as follows. Section 2 defines an assertion language 
that adapts the LLP semantics to the richer temporal setting required for rea- 
soning about programs. Section 3 defines the syntax and semantics of our broad 
spectrum programming and specification language that incorporates the asser- 
tion language from Sect. 2. Section 4 defines the semantic refinement relation we 
use for this class of programs and develops a number of refinement rules valid 
for this relation. Section 5 illustrates the use of the framework by presenting a 
formal development of a control theoretic example previously treated informally 
in the literature on knowledge-based programs. 



2 A Semantics for Reasoning about Knowledge and Time 

We begin by presenting a semantic framework for a single agent and its environment,
inspired by [4], to which we refer the reader for motivation.

Let $L_e$ be a set of possible states for the environment and let $L_1$ be a set of
possible local states for agent 1. We take $\mathcal{G} = L_e \times L_1$ to be the set of global states.
Let $A_1$ and $A_e$ be nonvoid sets of actions for agent 1 and for the environment,
respectively. (These sets usually contain a special null action $\Lambda$.) A joint action
is a pair $(a_e, a_1) \in A = A_e \times A_1$. A run over $\mathcal{G}$ and $A$ is a pair $r = (h, \alpha)$ of
infinite sequences: a state history $h : \mathbb{N} \to \mathcal{G}$, and an action history $\alpha : \mathbb{N} \to A$.
Intuitively, for $c \in \mathbb{N}$, $h(c)$ is the global state of the system at time $c$ and $\alpha(c)$ is
the joint action occurring at time $c$. (We say more about the transition relation
connecting states and actions later.) A system over $\mathcal{G}$ and $A$ is a set of runs over
$\mathcal{G}$ and $A$, intuitively representing all possible histories. A pair $(r, c)$ consisting
of a run $r$ (in system $S$) and a time $c \in \mathbb{N}$ is called a point (in $S$). We write
$\mathrm{Points}(S)$ for the set of points of $S$. Let $Prop$ be a set of propositional variables.
An interpretation of a system $S$ is a mapping $\pi : Prop \to \mathcal{P}(\mathrm{Points}(S))$ associating
a set of points with each propositional variable. Intuitively, proposition $p \in Prop$
is true exactly at the points contained in $\pi(p)$. An interpreted system (over $\mathcal{G}$
and $A$) is a pair $\mathcal{I} = (S, \pi)$ where $S$ is a system over $\mathcal{G}$ and $A$ and $\pi$ is an
interpretation of $S$.

The structure in the above definitions supports the following notions used to
define the agent’s knowledge. We say two points $(r, c)$, $(r', c')$ in a system $S$ are
1-indistinguishable, denoted $(r, c) \sim_1 (r', c')$, if the local components of the global
states at these points are equal, i.e., if there exist a local state $s_1 \in L_1$ and
states of the environment $s_e, s'_e$ such that $h(c) = (s_e, s_1)$ and $h'(c') = (s'_e, s_1)$,
where $r = (h, \alpha)$ and $r' = (h', \alpha')$. A set $P$ of points of $S$ is 1-local if it is closed
under $\sim_1$, in other words, when for all points $(r, c)$, $(r', c')$ of $S$, if $(r, c) \in P$ and
$(r, c) \sim_1 (r', c')$ then $(r', c') \in P$. Intuitively, 1-local sets of points correspond to
properties that the agent is able to determine entirely on the basis of its local
state. If $\pi$ and $\pi'$ are interpretations and $p \in Prop$, then $\pi'$ is said to be a 1-local
$p$-variant of $\pi$, denoted $\pi \simeq_p \pi'$, if $\pi$ and $\pi'$ differ at most in the value of $p$ and
$\pi'(p)$ is 1-local. If $\mathcal{I} = (S, \pi)$ and $\mathcal{I}' = (S', \pi')$ are two interpreted systems over
$\mathcal{G}$ and $A$, then $\mathcal{I}'$ is said to be a 1-local $p$-variant of $\mathcal{I}$, denoted $\mathcal{I} \simeq_p \mathcal{I}'$, if $S = S'$
and $\pi \simeq_p \pi'$.
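The notions of 1-indistinguishability and 1-locality can be illustrated on a finite toy model. The following Python sketch is only an illustration under simplifying assumptions: points are hypothetical triples (run name, time, local state) instead of points of infinite runs, and the names `POINTS`, `indistinguishable` and `is_one_local` are ours, not the paper's.

```python
from itertools import product

# A point is modelled as (run_name, time, local_state). The environment
# component of the global state is omitted because ~1 ignores it.
POINTS = [
    ("r", 0, "s0"), ("r", 1, "s1"),
    ("r2", 0, "s0"), ("r2", 1, "s2"),
]

def indistinguishable(p, q):
    """(r, c) ~1 (r', c') iff the local states at the two points agree."""
    return p[2] == q[2]

def is_one_local(subset, points=POINTS):
    """A set of points is 1-local iff it is closed under ~1."""
    return all(q in subset
               for p, q in product(points, repeat=2)
               if p in subset and indistinguishable(p, q))
```

In this toy model the set containing only the initial point of run `r` is not 1-local, because the initial point of `r2` carries the same local state.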

The logical language $\mathcal{L}$ we use in this paper resembles a restricted monadic
second order logic with two additions: (a) an S5-modality for necessity and (b)
operators from the linear time temporal logic LTL [^. Its syntax is given by:

$$\mathcal{L} \ni \phi ::= p \mid \neg\phi \mid \phi \wedge \phi \mid \mathrm{Nec}\,\phi \mid \forall_1 p\,(\phi) \mid \bigcirc\phi \mid \phi\,\mathcal{U}\,\phi \mid \ominus\phi \mid \phi\,\mathcal{S}\,\phi$$

where $p \in Prop$. Intuitively, $\mathrm{Nec}\,\phi$ says that $\phi$ is true at all points in the
interpreted system, and its dual $\mathrm{Poss}\,\phi = \neg\mathrm{Nec}\,\neg\phi$ states that $\phi$ is true at some
point. The formula $\forall_1 p\,(\phi)$ says that $\phi$ is true for all assignments of a 1-local
proposition (set of points) to the propositional variable $p$. We write $\exists_1 p\,(\phi)$ for its
dual $\neg\forall_1 p\,(\neg\phi)$. The remaining connectives have their standard interpretations
from linear time temporal logic: $\bigcirc$ (“next”), $\mathcal{U}$ (“until”), $\ominus$ (“previously”) and
$\mathcal{S}$ (“since”). We employ parentheses to indicate aggregation and use standard
abbreviations such as true, false, $\vee$, and definable future time operators like $\Box$
(“henceforth”) and $\Diamond$ (“eventually”), as well as their past time counterparts $\blacksquare$
(“until now”) and $\blacklozenge$ (“once”).

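Formulae of $\mathcal{L}$ are interpreted at a point $(r, c)$ of an interpreted system $\mathcal{I} = (S, \pi)$
by means of the satisfaction relation $\models$ defined inductively by:

- $\mathcal{I}, (r, c) \models p$ iff $(r, c) \in \pi(p)$;

- $\mathcal{I}, (r, c) \models \neg\phi$ iff $\mathcal{I}, (r, c) \not\models \phi$;

- $\mathcal{I}, (r, c) \models \phi \wedge \psi$ iff $\mathcal{I}, (r, c) \models \phi$ and $\mathcal{I}, (r, c) \models \psi$;

- $\mathcal{I}, (r, c) \models \mathrm{Nec}\,\phi$ iff $\mathcal{I}, (r', c') \models \phi$, for all $(r', c') \in \mathrm{Points}(S)$;

- $\mathcal{I}, (r, c) \models \forall_1 p\,(\phi)$ iff $\mathcal{I}', (r, c) \models \phi$ for all $\mathcal{I}'$ such that $\mathcal{I} \simeq_p \mathcal{I}'$;

- $\mathcal{I}, (r, c) \models \bigcirc\phi$ iff $\mathcal{I}, (r, c+1) \models \phi$;

- $\mathcal{I}, (r, c) \models \phi\,\mathcal{U}\,\psi$ iff there exists a $d \ge c$ such that $\mathcal{I}, (r, d) \models \psi$ and $\mathcal{I}, (r, e) \models \phi$
  for all $e$ with $c \le e < d$;

- $\mathcal{I}, (r, c) \models \ominus\phi$ iff $c > 0$ and $\mathcal{I}, (r, c-1) \models \phi$;

- $\mathcal{I}, (r, c) \models \phi\,\mathcal{S}\,\psi$ iff there exists a $d \le c$ such that $\mathcal{I}, (r, d) \models \psi$ and $\mathcal{I}, (r, e) \models \phi$
  for all $e$ with $d < e \le c$.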
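Because the past-time operators only look backwards, their clauses can be evaluated exactly on a finite prefix of a run. The sketch below does this for “previously” and “since”. It is an illustration only: the tuple encoding of formulae, the function name `holds`, and the reading of the bounds in the “since” clause as $d \le c$ and $d < e \le c$ are our assumptions. The constant `INIT` encodes the abbreviation “not previously true” that the paper uses for initial points in Section 5.

```python
def holds(formula, trace, c):
    """Evaluate a past-time formula at time c.

    `trace` is a list whose entry t is the set of propositions true at time t.
    Formulae are tuples such as ("since", ("prop", "p"), ("prop", "q")).
    """
    op = formula[0]
    if op == "true":
        return True
    if op == "prop":
        return formula[1] in trace[c]
    if op == "not":
        return not holds(formula[1], trace, c)
    if op == "and":
        return holds(formula[1], trace, c) and holds(formula[2], trace, c)
    if op == "prev":   # strong "previously": false at time 0
        return c > 0 and holds(formula[1], trace, c - 1)
    if op == "since":  # formula[1] S formula[2]
        return any(holds(formula[2], trace, d)
                   and all(holds(formula[1], trace, e) for e in range(d + 1, c + 1))
                   for d in range(c + 1))
    raise ValueError(f"unknown operator: {op}")

# "init" abbreviates "not previously true" and holds exactly at time 0.
INIT = ("not", ("prev", ("true",)))
```

The future-time operators are left out on purpose, since evaluating them would need a representation of the infinite remainder of the run.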

Given these constructs, it is possible to express many operators from the
literature on reasoning about knowledge. For example, consider the standard
knowledge operator $K_1$, defined by $\mathcal{I}, (r, c) \models K_1\phi$ if $\mathcal{I}, (r', c') \models \phi$ for all points
$(r', c')$ of $\mathcal{I}$ such that $(r, c) \sim_1 (r', c')$. This is expressible as $\exists_1 p\,(p \wedge \mathrm{Nec}(p \to \phi))$.
We refer to [3] for further examples and discussion.
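The claimed equivalence between $K_1\phi$ and $\exists_1 p\,(p \wedge \mathrm{Nec}(p \to \phi))$ can be checked by brute force on a finite model. The sketch below is a hedged illustration: the five points and the set `PHI` are made-up data, points are reduced to (name, local state) pairs, and 1-local sets are enumerated as unions of classes of points with the same local state.

```python
from itertools import combinations

# Hypothetical finite model: each point is (name, local_state).
POINTS = [("a", "s0"), ("b", "s0"), ("c", "s1"), ("d", "s2"), ("e", "s2")]
# Points at which the formula phi is assumed to hold.
PHI = {("a", "s0"), ("b", "s0"), ("c", "s1"), ("d", "s2")}

def local_sets(points):
    """Yield every 1-local set: a union of classes of points sharing a local state."""
    classes = {}
    for pt in points:
        classes.setdefault(pt[1], set()).add(pt)
    blocks = list(classes.values())
    for k in range(len(blocks) + 1):
        for chosen in combinations(blocks, k):
            yield set().union(*chosen)

def knows(point, phi, points=POINTS):
    """Direct definition: phi holds at every point indistinguishable from `point`."""
    return all(q in phi for q in points if q[1] == point[1])

def knows_via_quantifier(point, phi, points=POINTS):
    """exists_1 p (p and Nec(p -> phi)): a 1-local set contains `point` and lies inside phi."""
    return any(point in p and p <= phi for p in local_sets(points))
```

On this model the two definitions agree at every point. For instance, the agent does not know `PHI` at point `d`, because `e` has the same local state and lies outside `PHI`.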

3 Sequential Programs with Quantification over Local Propositions

In this section we define our wide spectrum programming language, and discuss 
its semantics. We also define a refinement relation on programs. 



3.1 Syntax 

The programming language describes the structure of segments of runs. Let $CV$
be a set of constraint variables and $PV$ a set of program variables. Define the
syntactic category $Prg$ of programs by

$$Prg \ni P ::= \varepsilon \mid Z \mid a \mid P * P \mid P + P \mid P^\omega \mid \exists_1 p\,(P) \mid [\phi, \psi]_X \mid [\phi]_X \mid \{\phi\}_C$$

where $Z \in PV$, $a \in A_1$, $p \in Prop$, $\phi, \psi \in \mathcal{L}$, $X \in CV$, and $C \subseteq CV$. The intuitive
meaning of these constructs is as follows. The symbol $\varepsilon$ denotes the empty
program, which takes no time to execute, and has no effects. Program variables
$Z$ are placeholders used to allow substitution of programs. Note that a program
may refer directly to actions $a$ of the agent, but the actions of the environment
are left implicit. The operation $*$ represents sequential composition. The symbol
$+$ denotes nondeterministic choice, while $P^\omega$ denotes zero or more (possibly
infinitely many) repetitions of $P$. The construct $\exists_1 p\,(P)$ can also be understood
as a kind of nondeterministic choice: it states that $P$ runs with respect to some
assignment of a 1-local proposition to the propositional variable $p$. The last three
constructs are like certain constructs found in refinement calculi. Intuitively, the
specification $[\phi, \psi]_X$ states that some program runs in this location that has the
property that, if started at a point satisfying $\phi$, eventually terminates at a point
satisfying $\psi$.¹ The coercion $[\phi]_X$ is a program that takes no time to execute, but
expresses a constraint on the surrounding program context: this must guarantee
that $\phi$ holds at this location. The constraint variable $X$ in specifications and
coercions acts as a label that allows references by other pieces of program text.
Specifically, this is done in the assertions $\{\phi\}_C$, which act like program annotations:
such a statement takes no time to execute, and, intuitively, asserts that $\phi$
can be proved to hold at this program location, with the proof depending only on
concrete program fragments and on specification and coercion statements whose
labels are in $C$. We may omit the constraint variables when it is not necessary
to make such references.

¹ In refinement calculi, such statements are typically associated with frame variables,
representing the variables allowed to change during the execution; we could add
these, but omit them for brevity.

In programs, “*” binds tighter than “+”. We employ parentheses to indicate
aggregation wherever necessary and tend to omit * near coercions and assertions.
Moreover, we use the following abbreviations:
$\mathbf{if}_X\ \phi\ \mathbf{then}\ P\ \mathbf{else}\ Q\ \mathbf{fi} = [\phi]_X\,P + [\neg\phi]_X\,Q$ and
$\mathbf{while}_X\ \phi\ \mathbf{do}\ P\ \mathbf{od} = ([\phi]_X\,P)^\omega\,[\neg\phi]_X$.
Our programming language can express some programs closely related to the
knowledge-based programs of [4]. These are programs such as:

case of 

if Kicf) do tti 

if do 02 

end case 

A program closely related to this is {[K\4>] oi -I- 02 -I- 

A)‘^ . The precise relationship is subtle and deferred to 
the full version of this paper. 

3.2 Semantics 

Our semantics will treat programs like specifications of certain sets of run segments
in a system, intuitively, the sets of run segments that can be viewed as
having been generated by executing the program. We note that the semantics
presented in this section treats assertions $\{\phi\}_C$ as equivalent to the null program
$\varepsilon$; the role of assertions in the framework will be explained later.

We first define execution trees, which represent unfoldings of the nondeterminism
in a program. It is convenient to represent these trees as follows. A binary
tree domain is a prefix-closed subset of the set $\{0,1\}^* \cup \{0,1\}^\omega$. So, each nonvoid
tree domain contains the empty sequence $\lambda$. Let $A$ be a set. An $A$-labelled binary
tree is a function $T$ from a binary tree domain $D$ to $A$. The nodes of $T$ are the
elements of $D$. The node $\lambda$ is called the root of $T$. If $n \in D$ we call $T(n)$ the
label at node $n$. If $n \in D$ then the children of $n$ in $T$ are the nodes of $T$ (if any)
of the form $n \cdot i$ where $i \in \{0, 1\}$. Finite maxima in the prefix order on $D$ are
called leaves of $T$.

An execution tree is a $Prg$-labelled binary tree, subject to the following constraints
on the nodes $n$:

1. If $n$ is labelled by $\varepsilon$, a program variable $Z \in PV$, a basic action $a$, a specification
   $[\phi, \psi]_X$, a coercion $[\phi]_X$, or an assertion $\{\phi\}_C$, then $n$ is a leaf.

2. If $n$ is labelled by $\exists_1 p\,(P)$ then $n$ has exactly one child $n \cdot 0$, labelled by $P$.

3. If $n$ is labelled by $P * Q$ or $P + Q$ then $n$ has exactly two children $n \cdot 0$, $n \cdot 1$,
   labelled by $P$ and $Q$ respectively.

4. If $n$ is labelled by $P^\omega$ then $n$ has exactly two children, $n \cdot 0$, $n \cdot 1$, labelled
   by $\varepsilon$ and $P * (P^\omega)$, respectively.

With each program $P$ we associate a particular execution tree, $T_P$, namely the
unique execution tree labelled with $P$ at the root $\lambda$.
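The four constraints determine the execution tree of a program node by node, so a finite part of it can be computed mechanically. The sketch below unfolds a program down to a chosen depth. The tuple encoding of programs (`("seq", P, Q)`, `("omega", P)` and so on) and the function names are our own illustrative choices.

```python
EPS = ("eps",)

def children(label):
    """Labels of the children n.0, n.1 (if any) of a node carrying `label`."""
    kind = label[0]
    if kind in ("seq", "choice"):      # P * Q and P + Q: children P and Q
        return [label[1], label[2]]
    if kind == "exists":               # exists_1 p (P): single child P
        return [label[2]]
    if kind == "omega":                # P^omega: children eps and P * (P^omega)
        return [EPS, ("seq", label[1], label)]
    return []                          # eps, variables, actions, specs, coercions, assertions

def unfold(label, depth):
    """Map tree-domain nodes (strings over '0' and '1') to labels, down to `depth`."""
    tree, frontier = {}, [("", label)]
    while frontier:
        node, lab = frontier.pop()
        tree[node] = lab
        if len(node) < depth:
            frontier.extend((node + str(i), kid)
                            for i, kid in enumerate(children(lab)))
    return tree
```

Unfolding `("omega", ("act", "a"))` to depth 2 shows the loop reappearing at node `11`, which is why the full tree of a loop is infinite.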

We now define the semantic constructs specified by programs. An interval in
a system $S$ is a triple $r[c, d]$ consisting of a run $r$ of $S$ and two elements $c$ and $d$
of $\mathbb{N}^+ = \mathbb{N} \cup \{\infty\}$ such that $c \le d$. We say that the interval is finite if $d < \infty$.
A set $I$ of intervals is run-unique if $r[c, d], r[c', d'] \in I$ implies $c = c'$ and $d = d'$.
An interpreted interval set over $S$ (or iis for short) is a pair $(\pi, I)$ consisting of
an interpretation $\pi$ of $S$ and a run-unique set $I$ of intervals over $S$.

We will view programs as specifying, or executing over, interpreted interval
sets, by means of certain mappings from execution trees to interpreted interval
sets. To facilitate the definition in the case of sequential composition, we
introduce a shorthand for the two sets obtained by splitting each interval in a
given set $I$ of intervals of $S$ in two. Say that $f : I \to \mathbb{N}^+$ divides $I$ whenever
$c \le f(r[c, d]) \le d$ holds for all $r[c, d] \in I$. Given some $f$ dividing $I$, we write
$f_\triangleleft(I)$ for the set of intervals $r[f(r[c, d]), d]$ such that $r[c, d] \in I$. Analogously, we
write $f_\triangleright(I)$ for $\{\, r[c, f(r[c, d])] \mid r[c, d] \in I \,\}$.
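The splitting of an interval set by a dividing function is simple enough to state as code. In the sketch below an interval $r[c, d]$ is encoded as a tuple `(r, c, d)` with `math.inf` for $\infty$. The function names and the example set `EXAMPLE` are hypothetical.

```python
import math

def divides(f, intervals):
    """f divides I iff c <= f(r[c, d]) <= d for every interval r[c, d] in I."""
    return all(c <= f((r, c, d)) <= d for (r, c, d) in intervals)

def initial_parts(f, intervals):
    """The set { r[c, f(r[c, d])] | r[c, d] in I }."""
    return {(r, c, f((r, c, d))) for (r, c, d) in intervals}

def final_parts(f, intervals):
    """The set { r[f(r[c, d]), d] | r[c, d] in I }."""
    return {(r, f((r, c, d)), d) for (r, c, d) in intervals}

# Example: cut every interval one step after its start.
EXAMPLE = {("r", 0, 3), ("r2", 2, math.inf)}
one_step = lambda interval: interval[1] + 1
```

With `one_step`, the initial parts all have length 1, which is the shape needed when the first component of a sequential composition is a single action.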

Let $S$ be a system, let $(\pi, I)$ be an iis w.r.t. $S$, and let $P$ be a program.
A function $\theta$ mapping each node $n$ of $T_P$ to an iis $(\pi_\theta(n), I_\theta(n))$
is an embedding of $T_P$ in $(\pi, I)$ w.r.t. $S$ whenever the following conditions are
satisfied:

1. $\theta(\lambda) = (\pi, I)$.

2. If $n$ is labelled $\varepsilon$ or $\{\phi\}_C$, then $c = d$ for all $r[c, d] \in I_\theta(n)$.

3. If $n$ is labelled $a$ then, for all $(h, \alpha)[c, d] \in I_\theta(n)$, if $c < \infty$ then both $d = 1 + c$
   and $a = a_1$, where $\alpha(c) = (a_e, a_1)$.

4. If $n$ is labelled $[\phi, \psi]_X$ then, for all $r[c, d] \in I_\theta(n)$, whenever $c < \infty$ and
   $(S, \pi_\theta(n)), (r, c) \models \phi$, then both $d < \infty$ and $(S, \pi_\theta(n)), (r, d) \models \psi$.

5. If $n$ is labelled $[\phi]_X$ then $c < \infty$ implies that $c = d$ and $(S, \pi_\theta(n)), (r, c) \models \phi$,
   for all $r[c, d] \in I_\theta(n)$.

6. If $n$ is labelled $\exists_1 p\,(Q)$ then $\pi_\theta(n) \simeq_p \pi_\theta(n \cdot 0)$ and $I_\theta(n \cdot 0) = I_\theta(n)$.

7. If $n$ is labelled $Q_1 + Q_2$, then $\pi_\theta(n \cdot 0) = \pi_\theta(n \cdot 1) = \pi_\theta(n)$ and $I_\theta(n)$ is the
   disjoint union of $I_\theta(n \cdot 0)$ and $I_\theta(n \cdot 1)$.

8. If $n$ is labelled $Q_1 * Q_2$, then $\pi_\theta(n \cdot 0) = \pi_\theta(n \cdot 1) = \pi_\theta(n)$ and there is an $f$
   dividing $I_\theta(n)$ such that $I_\theta(n \cdot 0) = f_\triangleright(I_\theta(n))$ and $I_\theta(n \cdot 1) = f_\triangleleft(I_\theta(n))$.

9. If $n$ is labelled $Q^\omega$ then $\pi_\theta(n \cdot 0) = \pi_\theta(n \cdot 1) = \pi_\theta(n)$ and $I_\theta(n)$ is the disjoint
   union of $I_\theta(n \cdot 0)$ and $I_\theta(n \cdot 1)$ (as in case 7) and, for all $r[c, d] \in I_\theta(n)$:
   $d = \bigsqcup \{\, d' \mid r[c', d'] \in I_\theta(n \cdot m) \text{ for some leaf } n \cdot m \text{ of } T_P \text{ below } n \,\}$.

We write $S, (\pi, I) \Vdash_\theta P$ whenever $\theta$ is an embedding of $T_P$ in $(\pi, I)$ w.r.t. $S$. Say
that $P$ occurs over $(\pi, I)$ w.r.t. $S$ if there exists a $\theta$ such that $S, (\pi, I) \Vdash_\theta P$.

Clauses 1 to 8 formalize the intuitive understanding given above for each of
the program constructs. Concerning clause 9 of this definition, we remark that,
by run-uniqueness and the other clauses, if $n \cdot m_0, n \cdot m_1, \ldots$ are the leaves $n \cdot m$
below $n$ for which $I_\theta(n \cdot m)$ contains an interval on $r$, in left to right order, and
these intervals are $r[c_0, d_0], r[c_1, d_1], \ldots$, respectively, then we have $d_i = c_{i+1}$ for
each index $i$ in the sequence. (We may have $c_i = d_i$.) Intuitively, if $d$ were not
the least upper bound $d'$ of the $d_i$, then this sequence of intervals would amount
to an execution of $Q^\omega$ over $r[c, d']$ rather than over $r[c, d]$. (See [16] for further
motivation.)



3.3 Refinement 

The semantics just presented can be shown to be a generalization of the semantics
of [16] for a similar language without the local propositional quantifier. That
semantics, however, dealt with single intervals where we have used a set of
intervals. The motivation for the change is that certain undesirable refinement
rules involving the local propositional quantifier would be valid under the earlier
semantic approach. We now present two definitions of refinement and an example
that motivates the richer semantics.

Intuitively, a program P refines Q if, whenever P executes, so does Q. A 
refinement relation of this type, when transitive and preserved under program 
composition, allows us to start with a high level specification and derive a con- 
crete implementation through a sequence of refinement steps. 

One refinement relation definable using our semantics is as follows: $P$ refines
$Q$, denoted $P \sqsubseteq Q$, when for all systems $S$ and interpreted interval sets
$(\pi, I)$ over $S$, if $S, (\pi, I) \Vdash P$ then $S, (\pi, I) \Vdash Q$. For the semantics using single
intervals, the corresponding relation would be defined by $P \sqsubseteq^* Q$ when for all
systems $S$, interpretations $\pi$ and intervals $r[c, d]$ of $S$, if $S, (\pi, \{r[c, d]\}) \Vdash P$ then
$S, (\pi, \{r[c, d]\}) \Vdash Q$. Clearly, if $P \sqsubseteq Q$ then $P \sqsubseteq^* Q$. As the following example
demonstrates, the converse is false.

Example 1. Let $\phi \in \mathcal{L}$ be any formula and consider the following two programs.

$$P = \mathbf{if}\ \phi\ \mathbf{then}\ a\ \mathbf{else}\ a * a\ \mathbf{fi} \qquad Q = \exists_1 p\,(\mathbf{if}\ p\ \mathbf{then}\ a\ \mathbf{else}\ a * a\ \mathbf{fi})$$

We shall first show that $P \sqsubseteq^* Q$ and then argue that this is not desirable.
Suppose $S, (\pi, \{r[c, d]\}) \Vdash P$. Recall that an if statement abbreviates a nondeterministic
choice. Thus, there are two cases to be considered:

Case 1: $S, (\pi, \{r[c, d]\}) \Vdash [\phi]\,a$. Define the 1-local $p$-variant $\pi'$ of $\pi$ by
$\pi'(p) = \mathrm{Points}(S)$, that is, $p$ is everywhere true under $\pi'$. It follows that
$S, (\pi', \{r[c, d]\}) \Vdash [p]\,a$, and thus, $S, (\pi', \{r[c, d]\}) \Vdash \mathbf{if}\ p\ \mathbf{then}\ a\ \mathbf{else}\ a * a\ \mathbf{fi}$.
By definition, $S, (\pi, \{r[c, d]\}) \Vdash Q$.

Case 2: $S, (\pi, \{r[c, d]\}) \Vdash [\neg\phi]\,a * a$. This is handled analogously by defining
$\pi'(p) = \emptyset$.

To see that it is not the case that $P \sqsubseteq Q$, take $\phi$ to be a propositional variable
$q$. It is straightforward to construct a system $S$, finite intervals $i = r[c, d]$
and $i' = r'[c', d']$, and interpretation $\pi$ such that $S, (\pi, \{i\}) \Vdash [q]\,a$ and
$S, (\pi, \{i'\}) \Vdash [\neg q]\,a * a$ (hence $S, (\pi, \{i, i'\}) \Vdash \mathbf{if}\ q\ \mathbf{then}\ a\ \mathbf{else}\ a * a\ \mathbf{fi}$),
but $(r, c)$ and $(r', c')$ are 1-indistinguishable. If we were to have $S, (\pi, \{i, i'\}) \Vdash
\exists_1 p\,(\mathbf{if}\ p\ \mathbf{then}\ a\ \mathbf{else}\ a * a\ \mathbf{fi})$, then we would have a 1-local $p$-variant $\pi'$ of $\pi$ such
that $S, (\pi', \{i, i'\}) \Vdash \mathbf{if}\ p\ \mathbf{then}\ a\ \mathbf{else}\ a * a\ \mathbf{fi}$. But by assumption $(r, c) \in \pi'(p)$ iff
$(r', c') \in \pi'(p)$, so we have either $S, (\pi', \{i, i'\}) \Vdash a$ or $S, (\pi', \{i, i'\}) \Vdash a * a$. But
neither of these is possible, since one or the other interval has the wrong length.
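The difference between the two semantics in this example can be reproduced with a small brute-force search. The sketch below abstracts heavily and is only an illustration: an interval is reduced to the local state at its starting point and its length, a 1-local test is modelled as a Boolean function of that local state, and all names are ours.

```python
# Two hypothetical intervals whose starting points share the local state "s":
# the first has length 1 (an execution of a), the second length 2 (a * a).
INTERVALS = [{"start_local": "s", "length": 1},
             {"start_local": "s", "length": 2}]

def branch_ok(p_true, length):
    """if p then a else a * a fi: length 1 if p holds at the start, else length 2."""
    return length == (1 if p_true else 2)

def uniform_local_test_exists(intervals):
    """Set-based semantics: one 1-local p must work for all intervals at once."""
    local_states = sorted({iv["start_local"] for iv in intervals})
    for bits in range(2 ** len(local_states)):
        p = {s: bool(bits >> k & 1) for k, s in enumerate(local_states)}
        if all(branch_ok(p[iv["start_local"]], iv["length"]) for iv in intervals):
            return True
    return False

def per_interval_test_exists(intervals):
    """Single-interval semantics: p may be chosen afresh for each interval."""
    return all(any(branch_ok(p, iv["length"]) for p in (True, False))
               for iv in intervals)
```

For `INTERVALS` the per-interval check succeeds while the uniform check fails, mirroring the gap between $\sqsubseteq^*$ and $\sqsubseteq$ described in the example.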

Our intuition in writing $Q$ is that it specifies a program that chooses to do
either $a$ or $a * a$ on the basis of some locally computable test $p$. The refinement
$P \sqsubseteq^* Q$ is contrary to this intuition: it states that $Q$ may be implemented
by using in place of $p$ any test, even one not locally computable. Intuitively,
this result is obtained by using a different 1-local test in different executions of
the program. Our semantics has been designed so as to avoid this: it ensures
that a uniform test $p$ is used in every execution of the program. Thereby, the
undesirable refinement is blocked.

We remark that a slight variant of the example is a valid, and desired refinement:
$[\exists_1 p\,(\mathrm{Nec}(p \equiv \phi))]\,P \sqsubseteq Q$. Here, the coercion states that $\phi$ is in fact
equivalent to a 1-local proposition. We will use this rule below. □



4 Validity and Valid Refinement 

We now briefly discuss the role of assertions $\{\phi\}_C$ in the framework and define
the associated semantic notions. The reader is referred to [16] for a more detailed
explanation of these ideas in a simpler setting.

Intuitively, an assertion $\{\phi\}_C$ is like an annotation at a program location
stating that $\phi$ is guaranteed to hold whenever the program execution reaches
this location. Moreover, such an assertion states that this fact “depends” only on 
constraints in the program (specifications and coercions) labelled with constraint 
variables in the set C, as well as on concrete program fragments. (We do not 
include labels for these because they cannot be “refined away”.) The reason we 
include the justification C for the assertion is that it proves to be necessary to 
track such information in order to be able to formulate a number of desirable 
refinement rules. These rules refine a program fragment in ways that depend 
upon the larger program context within which the fragment occurs. 

One typical example of this is a rule concerning the elimination of coercions. 
Suppose a coercion $[\phi]$ occurs at a program location where $\phi$ is guaranteed
to hold. Intuitively, we would like to say that the coercion can be eliminated
(replaced by $\varepsilon$) in such circumstances. However, the attempt to formulate this
by the refinement rule $\varepsilon \leq \{\phi\}\,[\phi]$ is not quite correct, for the reason the assertion
holds could be the very coercion we seek to eliminate. (It may seem a little odd 
at first to say that the justification for the assertion is some part of the program 
text that follows, but consider the case of $\phi$ = See [16] for an example that
makes essential use of assertions justified by later pieces of program text.) The
use of justifications enables us to formulate the rule as $\varepsilon \leq \{\phi\}_C\,[\phi]_X$, provided
$X$ is not in $C$, i.e., provided the assertion does not rely upon the coercion. This
blocks the circular reasoning. 

The semantics of assertions is formalized as follows. In order to capture constraint
dependencies, we first define for each program $P$ and constraint set
$C \subseteq CV$ a program $\mathrm{relax}(P, C)$ that is like $P$, except that only constraints
whose labels are in $C$ are enforced: all other constraints are relaxed. Formally,
we obtain $\mathrm{relax}(P, C)$ from $P$ by replacing each occurrence of a coercion $[\phi]_X$
where $X \notin C$ by $\varepsilon$, and also replacing each occurrence of a specification $[\phi, \psi]_X$
where $X \notin C$ by $[\mathit{false}, \mathit{true}]_X$ in $P$.
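The definition of relax is a straightforward syntactic traversal. The sketch below shows it on a tuple encoding of programs that we introduce purely for illustration (`("coerce", phi, X)`, `("spec", phi, psi, X)`, `("assert", phi, C)`, and so on). Formulae are kept as opaque strings.

```python
def relax(prog, keep):
    """Keep only the coercions and specifications whose label is in `keep`."""
    kind = prog[0]
    if kind == "coerce":                 # ("coerce", phi, X) becomes eps if X is dropped
        return prog if prog[2] in keep else ("eps",)
    if kind == "spec":                   # ("spec", phi, psi, X) becomes [false, true]_X
        return prog if prog[3] in keep else ("spec", "false", "true", prog[3])
    if kind in ("seq", "choice"):
        return (kind, relax(prog[1], keep), relax(prog[2], keep))
    if kind == "omega":
        return ("omega", relax(prog[1], keep))
    if kind == "exists":                 # ("exists", p, P)
        return ("exists", prog[1], relax(prog[2], keep))
    return prog                          # eps, variables, actions, assertions
```

Relaxing with respect to the justification set of an assertion leaves exactly the constraints on which that assertion is allowed to depend.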

We may now define a program $P$ to be valid with respect to a set of interpreted
systems $\mathbb{S}$ when for all assertions $\{\phi\}_C$ in $P$, all interpreted systems
$(S, \pi) \in \mathbb{S}$ and all interval sets $I$ over $S$, all embeddings $\theta$ of $T_{\mathrm{relax}(P,C)}$ in
$S, (\pi, I)$ have the property that for all nodes $n$ of $T_{\mathrm{relax}(P,C)}$ labelled with $\{\phi\}_C$
we have $S, \theta(n) \Vdash [\phi]$. Intuitively, the embedding represents an execution of $P$
in which only constraints in $C$ are enforced, and we check that the associated
assertions hold at the appropriate points in the execution. Note that when $n$ is
labelled by an assertion, $I_\theta(n)$ must be a set of intervals of length 0. Moreover,
the semantics of $S, (\pi, I) \Vdash [\phi]$ checks $\phi$ only at finite points in this set. Thus,
validity can be understood as a kind of generalized partial correctness. We define
validity with respect to a set of interpreted systems $\mathbb{S}$ to allow assumptions
concerning the environment to be modelled: e.g., $\mathbb{S}$ might be the set of all interpreted
systems in which actions have specific intended interpretations. We give
an example of this in the next section.

Clearly, we want to avoid programs that are not valid (such as [p]^ 

Thus, we would now like a notion of refinement that preserves validity, so that we
derive only valid programs from valid programs by refinement. The refinement
relation $\sqsubseteq$ defined above does not have this property. However, we may use it to
define a notion that does. In order to do so, we first need to define a technical
notion. A justification transformation is a mapping $\eta : 2^{CV} \to 2^{CV}$ that is
increasing, i.e., satisfies $C \subseteq \eta(C)$ for all $C \subseteq CV$. The result of applying a
justification transformation $\eta$ to a program $P$ is the program $P\eta$ obtained by
replacing each instance of an assertion $\{\phi\}_C$ in $P$ by the assertion $\{\phi\}_{\eta(C)}$. When
$R(Z)$ is a program containing a program variable $Z$ and $P$ is a program, we write
$R\eta(P)$ for the result of first applying $\eta$ to $R(Z)$ and then substituting $P$ for $Z$.
We need such transformations for refinements such as replacing $\{\phi\}_C\,[\phi]_X$ by $\varepsilon$
when $X \notin C$ within some large program context. Intuitively, when we do this,
any assertion in the larger context that depended on the coercion labelled $X$ is
still valid, but its justification should now include $C$ in place of $X$.

The identity justification transformation is denoted by $\iota$. We will also represent
justification transformations using expressions of the form $X \hookrightarrow D$, where
$X \in CV$ and $D \subseteq CV$. Such an expression denotes the justification transformation
$\eta$ such that $\eta(C) = C \cup D$ if $X \in C$ and $\eta(C) = C$ otherwise.
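Justification transformations of this form are easy to model as functions on sets of labels. The sketch below is illustrative only: `arrow` builds the transformation that adds $D$ to every justification containing $X$, and `apply_to_program` rewrites the assertions of a program in a hypothetical tuple encoding where an assertion is `("assert", phi, C)`.

```python
def arrow(x, d):
    """Build the transformation that adds D to every justification containing X."""
    d = frozenset(d)
    def eta(c):
        c = frozenset(c)
        return c | d if x in c else c
    return eta

def identity(c):
    """The identity justification transformation."""
    return frozenset(c)

def apply_to_program(eta, prog):
    """Replace every assertion ("assert", phi, C) in `prog` by ("assert", phi, eta(C))."""
    if prog[0] == "assert":
        return ("assert", prog[1], eta(prog[2]))
    return tuple(apply_to_program(eta, part) if isinstance(part, tuple) else part
                 for part in prog)
```

Both `identity` and every transformation built by `arrow` are increasing in the sense required above, since they never remove a label from a justification.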

Let $\mathbb{S}$ be a set of interpreted systems, let $\eta$ be a justification transformation
and let $P$ and $Q$ be programs. Say that $P$ validly refines $Q$ in $\mathbb{S}$ under $\eta$, and
write $P \leq^{\mathbb{S}}_{\eta} Q$, if for all programs $R(Z)$ with $Z$ a program variable, if $R(Q)$
is valid with respect to $\mathbb{S}$ then $R\eta(P)$ is valid with respect to $\mathbb{S}$, and for all
$(S, \pi) \in \mathbb{S}$ and interval sets $I$ over $S$, if $S, (\pi, I) \Vdash R\eta(P)$ then $S, (\pi, I) \Vdash R(Q)$.

We remark that other definitions of valid refinement are possible. While in- 
tuitive, the definition above is very sensitive to the syntax of the programming 
language. We will consider some closely related semantic alternatives elsewhere. 

4.1 Valid Refinement Rules 

We now present a number of rules concerning valid refinement that are sound
with respect to the semantics just presented, making no attempt at completeness.
We focus on rules concerning the existential quantifiers, and refer to [16] for
additional rules concerning the other constructs, which are also sound in the
framework of the present paper.

The following rules make it possible for refinement to be broken down into a
sequence of steps that operate on small program fragments. (Only justification
transformations operate globally, but this can also be managed locally by means
of appropriate data structures.)

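$$\frac{P \leq^{\mathbb{S}}_{\eta} Q \qquad Q \leq^{\mathbb{S}}_{\eta'} R}{P \leq^{\mathbb{S}}_{\eta \circ \eta'} R} \qquad\qquad \frac{P \leq^{\mathbb{S}}_{\eta} Q}{R\eta(P) \leq^{\mathbb{S}}_{\eta} R(Q)}$$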

Reducing the amount of nondeterminism and introducing a coercion are sound 
refinement steps. 



$$P \leq^{\mathbb{S}}_{\iota} P + Q \qquad\qquad [\phi]_X \leq^{\mathbb{S}}_{\iota} \varepsilon$$

Quantification over local propositional variables can be introduced, extracted 
from a coercion, and lifted to contexts. 

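$$\exists_1 p\,(P) \leq^{\mathbb{S}}_{\iota} P \quad \text{if } p \text{ not free in } P \qquad \text{(i-lq)}$$

$$\exists_1 p\,([\phi]_X) \leq^{\mathbb{S}}_{\iota} [\exists_1 p\,(\phi)]_X \qquad \text{(ext-lq)}$$

$$\exists_1 p\,(R(P)) \leq^{\mathbb{S}}_{\iota} R(\exists_1 p\,(P)) \quad \text{if } p \text{ not free in } R(Z) \qquad \text{(lift-lq)}$$

Let $P[\phi/p]$ denote the program obtained from $P$ by substituting formula $\phi$ for all
free occurrences of $p$ in $P$, while taking the usual care of free variables in $\phi$ by
renaming clashing bound variables in $P$.

$$[\exists_1 p\,(\mathrm{Nec}(\phi \equiv p))]\; P[\phi/p] \leq^{\mathbb{S}}_{\iota} \exists_1 p\,(P) \qquad \text{(inst-lp)}$$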



4.2 Single-Stepping Programs and Loops 

Reasoning about termination of a loop, say, while $p$ do $P$ od becomes easier
when strict bounds on the running time of $P$ are known. We present here a
simple example of this phenomenon that is useful for the example we present in
Sect. 5. More general rules can be formulated than the one we develop here.

Say that program $P$ is single-stepping, if $S, (\pi, I) \Vdash P$ and $r[c, d] \in I$ and
$c < \infty$ imply that $d = 1 + c$, for all $S$, $\pi$, and $I$. In a slightly broader syntax
with existential quantification over arbitrary propositions, not just local ones,
the fact that $P$ is single-stepping could be expressed by:

$$P \sqsubseteq \exists p\,\bigl([\bigcirc\,\mathit{first}\ p]\ [\mathit{true},\ \mathit{first}\ p]\bigr),$$

where $\mathit{first}\ \phi$ is an abbreviation for $\phi \wedge \neg\ominus\blacklozenge\phi$, which holds exactly at the first
point in a run that makes $\phi$ true. This notion can be combined with the usual
pre/post-condition style of specifying $P$’s behaviour to specify that $P$ is single-stepping
and terminates in points satisfying $\psi$ when started in points satisfying $\phi$:

$$P \sqsubseteq \exists p\,\bigl([\bigcirc\,\mathit{first}\ p]_X\ [\mathit{true},\ \mathit{first}\ p \wedge (\ominus\phi \to \psi)]_X\bigr).$$

Denote the RHS of the above by $\mathrm{ss}[\phi, \psi]_X$. So $S, (\pi, I) \Vdash \mathrm{ss}[\phi, \psi]_X$ if for all
$r[c, d] \in I$, whenever $c < \infty$ and $(S, \pi), (r, c) \models \phi$, then both $d = c + 1$ and
$(S, \pi), (r, d) \models \psi$. Observe that $\mathrm{ss}[\phi, \psi]_X$ takes a single step regardless of whether
$\phi$ holds initially. Consequently, $\mathrm{ss}[\phi, \psi]_X$ is indeed single-stepping. Adding the
single-stepping requirement yields a valid refinement: $\mathrm{ss}[\phi, \psi]_X \leq^{\mathbb{S}}_{\iota} [\phi, \psi]_X$. The
following rule for single-stepping loop bodies will be used in Section 5.

$$[\phi \to \chi] * \exists_1 p \left( [\chi \to \Diamond p] * \mathbf{while}\ \neg p\ \mathbf{do}\ \mathrm{ss}[\chi \wedge \neg p,\ \chi]\ \mathbf{od} * [\chi \wedge p \to \psi] \right) \leq^{\mathbb{S}}_{\iota} [\phi, \psi] \qquad \text{(i-ss-loop)}$$

To apply this rule, one has to invent a (not necessarily local) loop invariant $\chi$.
Finding a concrete local guard is postponed via use of the existential quantification.
Just as for ordinary sequential programs, the first and last coercion link
the invariant to the pre- and postcondition of the specification that is to be
implemented. The second coercion, $[\chi \to \Diamond p]$, ensures termination of the loop.

5 Example: Autonomous Robot 

In this section we discuss an example that closely resembles Example 7.2.2 in [4]
which in turn has been inspired by the 1994 conference version of [2].

Fig. 1. Autonomous Robot (corridor positions with the goal region 2, 3, 4 marked)

A robot travels along an endless corridor, which in this example is identified
with the natural numbers. The robot starts at 0 and has the goal of stopping
in the goal region {2,3,4}. To judge when to stop the robot has a sensor that
reads the current position. (See Fig. 1.) Unfortunately, this sensor is inaccurate;
the readings may be wrong by at most 1. The only action the robot can actively
take is halting, the effect of which is instantaneous stopping. Unless this action
is taken, the robot may move by steps of length 1 to higher numbers. Unless it
has taken its halting action, it is beyond its control whether it moves in a step.
Our task is now to design a control program for the robot such that:

(safety) The robot only stops in the goal region. 

(liveness) The robot is guaranteed to stop eventually. 

A modest assumption about the environment is needed for the latter to be 
achievable. We insist that it is not the case that the robot sits still forever 
without moving forward or taking the halting action. 



126 Kai Engelhard!, Ron van der Meyden, and Yoram Moses 



To model these assumptions we introduce a system constraint reflecting the
following conditions. Strictly speaking, our specification language $\mathcal{L}$ only contains
variables that are interpreted as Boolean values but none for natural numbers.
It is possible to present this example only using propositions by sacrificing
legibility. An extension of our framework to typed variables is straightforward
and omitted here for brevity. Let $\mathbb{S}$ be the set of interpreted systems satisfying
the following constraints.

1. Initially, the robot’s position $x$ is zero: $\mathit{init} \to x = 0$, where $\mathit{init}$ abbreviates
   the formula $\neg\ominus\mathit{true}$, which holds exactly in the initial points of runs.

2. Proposition $h$ is initially false and becomes true once the robot has
   halted. Halting is an irreversible action ($h \to \bigcirc h$) and means that the robot
   does not move anymore: $h \to x = \bigcirc x$.

3. Proposition $m$ is true iff the robot moves in the current step. Moving means
   that the robot’s position is increased by one, otherwise it is unchanged:
   $(m \to x + 1 = \bigcirc x) \wedge (\neg m \to x = \bigcirc x)$.

4. If the robot has not halted it should move eventually: $(\neg h)\ \mathcal{U}\ (h \vee m)$.

5. The robot’s sensor reading is $s$ (an integer) and off by at most one from $x$,
   the actual position: $x - 1 \le s \le x + 1$.

6. Only the robot’s basic action halt immediately halts the robot.

The variables and propositions mentioned in the constraints are reserved in the
sense that quantification over them is not allowed. Thus they essentially “behave”
the same in each $(S, \pi) \in \mathbb{S}$. In the full paper we introduce a syntactic
representation for such system constraints, give a formal semantics, and introduce
valid refinement rules that exploit these constraints. These rules fall into
two classes: assertion introduction rules and rules for specification implementation
by basic actions. A typical assertion introduction rule for this particular $\mathbb{S}$
is

$$\{\mathit{init} \to x = 0 \wedge \neg h\} \leq^{\mathbb{S}}_{\iota} \varepsilon \qquad (1)$$

allowing one to assert a property of initial states in interpreted systems contained
in $\mathbb{S}$. For the halting action we would have

$$\mathbf{halt} \leq^{\mathbb{S}}_{\iota} \mathrm{ss}[\mathit{true},\ h \wedge x = \ominus x]. \qquad (2)$$

For lack of space we have simplified and pruned the set-up to the above. We
refer to “use $\mathbb{S}$” instead of formal refinement rules at points of our derivation
that refer to the rules omitted.

In [3] a run-based specification of the system is given by a temporal logic
formula equivalent to $\Box(h \to g) \wedge \Diamond h$, where $g$ abbreviates being in the goal
region, i.e., $2 \le x \le 4$. The two conjuncts respectively formalize the safety and
liveness property from above. The main problem in finding the robot’s protocol
is to derive a suitable local condition for halting.

We formally derive a protocol for the robot from as abstract as possible
a specification of the protocol. The point of departure of our derivation below
merely states that the robot must eventually halt in the goal region when started
in an initial state.

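$$[\mathit{init},\ g \wedge h]$$

$\geq^{\mathbb{S}}_{\iota}$ (sequential composition [16])

$$[\mathit{init},\ g] * [g,\ g \wedge h]$$

$\geq^{\mathbb{S}}_{\iota}$ (use $\mathbb{S}$ to establish $\mathbf{halt} \leq^{\mathbb{S}}_{\iota} [g,\ g \wedge h]$, cf. (2))

$$[\mathit{init},\ g] * \mathbf{halt}$$

$\geq^{\mathbb{S}}_{\iota}$ (i-ss-loop with loop invariant $x \le 4$ to prevent exiting the goal region)

$$[\mathit{init} \to x \le 4] * \exists_1 p \left( [x \le 4 \to \Diamond p]\ \mathbf{while}\ \neg p\ \mathbf{do}\ \mathrm{ss}[x \le 4 \wedge \neg p,\ x \le 4]\ \mathbf{od}\ [x \le 4 \wedge p \to g] \right) * \mathbf{halt}$$

$\geq^{\mathbb{S}}_{\iota}$ (use $\mathbb{S}$ as in (1) to assert $\mathit{init} \to x \le 4$, eliminate coercion)

$$\exists_1 p \left( [x \le 4 \to \Diamond p]\ \mathbf{while}\ \neg p\ \mathbf{do}\ \mathrm{ss}[x \le 4 \wedge \neg p,\ x \le 4]\ \mathbf{od}\ [x \le 4 \wedge p \to g] \right) * \mathbf{halt}$$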

At this point we select the local test $p$. The need to satisfy coercion $x \le 4 \wedge p \to g$
together with the fact that the sensor reading differs from the position $x$ by at
most 1, leads naturally to the choice $p = s > 2$.
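The arithmetic behind this choice can be checked by enumeration. The sketch below is an illustration under our own reading of the garbled inequalities, namely the invariant $x \le 4$, the goal region $2 \le x \le 4$, and a reading $s$ with $x - 1 \le s \le x + 1$. It tests, for a candidate threshold $t$, whether $x \le 4 \wedge s > t$ implies the goal and whether the invariant survives one more step while $s \le t$.

```python
def goal(x):
    """The goal region {2, 3, 4}."""
    return 2 <= x <= 4

def sound_test(threshold, max_pos=10):
    """Does 'x <= 4 and s > threshold' imply the goal for every admissible reading s?"""
    return all(goal(x)
               for x in range(max_pos + 1)
               for s in (x - 1, x, x + 1)
               if x <= 4 and s > threshold)

def invariant_preserved(threshold, max_pos=10):
    """Does x <= 4 still hold after one move whenever the guard 's <= threshold' holds?"""
    return all(x + 1 <= 4
               for x in range(max_pos + 1)
               for s in (x - 1, x, x + 1)
               if x <= 4 and s <= threshold)
```

Among integer thresholds, only $t = 2$ passes both checks: a smaller threshold lets the robot halt before the goal region, and a larger one lets it leave the region.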

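$\geq^{\mathbb{S}}_{\iota}$ (inst-lp)

$$[\exists_1 p\,(\mathrm{Nec}(p \equiv (s > 2)))] * [x \le 4 \to \Diamond\, s > 2] * \mathbf{while}\ s \le 2\ \mathbf{do}\ \mathrm{ss}[x \le 4 \wedge s \le 2,\ x \le 4]\ \mathbf{od} * [x \le 4 \wedge s > 2 \to g] * \mathbf{halt}$$

$\geq^{\mathbb{S}}_{\iota}$ (eliminate two coercions using $\mathbb{S}$)

$$[x \le 4 \to \Diamond\, s > 2] * \mathbf{while}\ s \le 2\ \mathbf{do}\ \mathrm{ss}[x \le 4 \wedge s \le 2,\ x \le 4]\ \mathbf{od} * \mathbf{halt}$$

$\geq^{\mathbb{S}}_{\iota}$ (use $\mathbb{S}$ for $\Lambda \leq^{\mathbb{S}}_{\iota} \mathrm{ss}[x \le 4 \wedge s \le 2,\ x \le 4]$)

$$[x \le 4 \to \Diamond\, s > 2] * \mathbf{while}\ s \le 2\ \mathbf{do}\ \Lambda\ \mathbf{od} * \mathbf{halt}$$

$\geq^{\mathbb{S}}_{\iota}$ (introduce coercion and strengthen coercion [16])

$$[\mathit{init}] * [\Diamond\, s > 2] * \mathbf{while}\ s \le 2\ \mathbf{do}\ \Lambda\ \mathbf{od} * \mathbf{halt}$$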

The coercion $[\Diamond\, s > 2]$ can be eliminated by reasoning about both the program
and $\mathbb{S}$. From the initial state predicate it follows that the loop begins in a state
satisfying $\neg h$. The only action executed in the loop is $\Lambda$, which in $\mathbb{S}$ preserves
the value of $h$. On termination of the loop the guard must be false, i.e., $s > 2$.
In the (purely hypothetical) case that the loop diverges, the run satisfies $\Box\neg h$, which
together with point 4, $(\neg h)\ \mathcal{U}\ (h \vee m)$, allows us to conclude that the robot moves
infinitely often. But this also implies that eventually $s > 2$.



128 Kai Engelhardt, Ron van der Meyden, and Yoram Moses



>f (use § and the loop) 

[inii^ * {0('5 > 2)}y * [0(s > 2)]'^ while^ s < 2 do yl od * halt 
^Xw{Y} (eliminate coercion) 

[init]^ while ^ s < 2 do d od * halt 

Finally, the rule 



[4>] P <fi 

P<§ [<(.,r/>]^ 

proves while ^ s < 2 do d od * halt [init,g A h]^ , yielding a concrete 

implementation. 

An alternative derivation from point onwards indicates how the knowledge- 
based approach could be modeled in our framework. Firstly we would choose just 
true as loop invariant. Secondly, instead of guessing the appropriate local exit 
condition $s > 2$ we would let the robot execute the loop until it knows that it
is in the goal region, i.e., instantiate $p$ with $K_1 g$. The derivation then proceeds
as before till reaching the stage before eliminating the last coercion concerning 
completeness of the test: 

[init]^ * [^(ATig)]^ while^ ^K\g do d od * halt 

To develop this to an implementation, that is, eliminate $[\Diamond(K_1 g)]$, requires
additional features to be introduced into the framework, so we will not pursue
this here.



6 Conclusion and Future Work 

We have sketched the main features of the first compositional refinement cal- 
culus incorporating an assertion language strong enough to express temporal 
and epistemic notions. While, as we have noted, some further features are required
to give a complete treatment of knowledge-based programs in the sense
of [4], we already have enough expressiveness in the framework to be able to
view knowledge-based programs as special cases of our more general programs
using quantified local propositions. Moreover, the derivation we have presented
at length is very much in the spirit of the knowledge-based approach. (Indeed,
precisely the same implementation is derived in [2].) In contrast to tests for
knowledge, tests for local predicates satisfying some extra conditions are more 
likely, in general, to admit efficient implementations. In future work, we plan 
to extend the framework of this paper to multiple agents and asynchrony. Ulti- 
mately, we hope to achieve a highly expressive, flexible and abstract framework 
supporting the knowledge-based development of distributed systems. 
Abstract. The notion of data type specification refinement is discussed
in a setting of System F and the logic for parametric polymorphism 
of Plotkin and Abadi. At first order, one gets a notion of specification 
refinement up to observational equivalence in the logic simply by us- 
ing Luo’s formalism. This paper generalises this notion to abstract data 
types whose signatures contain higher-order and polymorphic functions. 
At higher order, the tight connection in the logic between the existence 
of a simulation relation and observational equivalence ostensibly breaks 
down. We show that an alternative notion of simulation relation is suit- 
able. This also gives a simulation relation in the logic that composes at 
higher order, thus giving a syntactic logical counterpart to recent ad- 
vances on the semantic level. 



1 Introduction 

The idea behind formal specification refinement is that a program is the end- 
product of a step-wise refinement process starting from an abstract high-level 
specification. At each refinement step some design decisions and implementa- 
tion issues are resolved, and if each refinement step can be proven correct, the 
resulting program is guaranteed to satisfy the initial specification. 

There are several frameworks in which to do this and several ideas of what it is 
for one specification to be a refinement of another. A prominent framework is that 
of algebraic specification; see [S] for a survey and comprehensive bibliography. 
But there has been substantial development in other fields as well, notably in 
type theory, where also ideas from algebraic specification have been expressed. 

This paper investigates specification refinement in a setting consisting of 
System F and relational parametricity in Reynolds’ sense p5|2dj as expressed 
in Plotkin and Abadi's logic for parametric polymorphism [31]. This setting
allows an elegant formalisation of abstract data types as existential types [27].
Moreover, the relational parametricity axiom enables one to derive in the logic
that two concrete data types, i.e. inhabitants of existential type, are equal if
and only if there exists a simulation relation [25] between their implementation
parts. Together with the fact that at first order, equality at existential type is
derivably equivalent to a notion of observational equivalence, this formalises the
semantic proof principle of Mitchell [25]. This lifts the type-theoretic formalism
of refinement due to Luo [22] to a notion in the logic of specification refinement
up to observational equivalence, a key issue in program development.

In this paper, we discuss the above type-theoretic notion of specification 
refinement in more generality, i.e. we treat data types whose operations may 
be higher order and polymorphic. At higher order, the formal link between the 
existence of a simulation relation and observational equivalence breaks down. 
Our solution in the logic is to use an alternative notion of simulation relation 
based on a weaker arrow-type relation. This notion composes at higher order,
thus relating the syntactic level to recent and on-going work on the semantic level
remedying the fact that logical relations traditionally used to describe refinement
do not compose at higher order [17,18,21,20,32].

In p2] an account of algebraic specification refinement |, 381, 37) is mapped to 
the first-order type-theoretic refinement notion, and the two accounts of refine- 
ment are shown to coincide. Important issues in algebraic specification refine- 
ment, such as the choice of input sorts m and the stability of constructors 
I39I37I10) . are automatically resolved in the type-theoretic setting. Other work 
linking algebraic specification and type theory includes [2813412141140) . Relevant 
work using System F and parametricity includes [20,30], showing that the introduction
of non-terminating recursion also breaks down the tight correspondence
between the existence of a simulation relation and observational equivalence.

In [12] a proof method from algebraic specification for proving observational
refinements mm is imported into the type-theory logic by adding axioms postu- 
lating the existence of quotients and sub-objects. Work related to this is [23112]. 
The higher-order generalisation of this is to be found in [13].

Section 2 outlines the type theory. In Sect. 3 refinement is introduced in a
first-order setting, and Sect. 4 generalises to higher order and polymorphism.

2 System F and the Logic for Parametric Polymorphism 

We briefly recall the parametric $\lambda$-calculus System F, and sketch the accompanying
logic of mm for relational parametricity on System F. It is this accompanying
logic that bears a relational extension rather than the $\lambda$-calculus. See [T]
for a more internalised approach. System F has types and terms as follows:

\[ T ::= X \mid T \rightarrow T \mid \forall X.T \qquad\qquad t ::= x \mid \lambda x\!:\!T.t \mid t\,t \mid \Lambda X.t \mid t\,T \]

where X and x range over type and term variables resp. However, formulae are 
now built using the usual connectives from equations and relation symbols: 

\[ \phi ::= (t =_A u) \mid R(t,u) \mid \cdots \mid \forall R \subset A \times B.\,\phi \mid \exists R \subset A \times B.\,\phi \]

where $R$ ranges over relation symbols. We write $\alpha[R,X,x]$ to indicate possible
and all occurrences of $R$, $X$ and $x$ in $\alpha$, and may write $\alpha[\rho,A,t]$ for the result
of substitution, following the appropriate rules concerning capture.
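The type grammar and the remark on capture can be made concrete with a small sketch. The class names `TVar`, `Arrow`, `Forall` and the helpers `free_type_vars`, `fresh` and `subst` are illustrative assumptions, not notation from the paper; the sketch covers types only and renames a bound variable whenever a substitution would otherwise capture it.

```python
from dataclasses import dataclass
from itertools import count

# Types of System F:  T ::= X | T -> T | forall X. T
# (hypothetical class names, chosen for this sketch only)
@dataclass(frozen=True)
class TVar:
    name: str

@dataclass(frozen=True)
class Arrow:
    dom: object
    cod: object

@dataclass(frozen=True)
class Forall:
    var: str
    body: object

def free_type_vars(t):
    """Set of type variables occurring free in t."""
    if isinstance(t, TVar):
        return {t.name}
    if isinstance(t, Arrow):
        return free_type_vars(t.dom) | free_type_vars(t.cod)
    if isinstance(t, Forall):
        return free_type_vars(t.body) - {t.var}
    raise TypeError(f"not a System F type: {t!r}")

def fresh(avoid):
    """First name X0, X1, ... that is not in `avoid`."""
    return next(f"X{i}" for i in count() if f"X{i}" not in avoid)

def subst(t, x, a):
    """Capture-avoiding substitution of the type a for the variable x in t."""
    if isinstance(t, TVar):
        return a if t.name == x else t
    if isinstance(t, Arrow):
        return Arrow(subst(t.dom, x, a), subst(t.cod, x, a))
    if isinstance(t, Forall):
        if t.var == x:
            return t  # x is shadowed by the binder, nothing to do
        if t.var in free_type_vars(a):
            # Rename the bound variable first so that it cannot capture a.
            new = fresh(free_type_vars(a) | free_type_vars(t.body) | {x})
            body = subst(t.body, t.var, TVar(new))
            return Forall(new, subst(body, x, a))
        return Forall(t.var, subst(t.body, x, a))
    raise TypeError(f"not a System F type: {t!r}")

example = Forall("Y", Arrow(TVar("X"), TVar("Y")))
renamed = subst(example, "X", TVar("Y"))
```

Substituting `Y` for `X` under the binder `forall Y` forces a renaming of the bound variable, so the substituted `Y` stays free in `renamed`.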

A second-order environment consists of a type environment A and a term- 
environment r depending on A as usual. For notational convenience we will 
amalgamate environments into a single environment B. Judgements for type and 
term formation are as usual. However, formula formation now involves relation
symbols, and we therefore employ relation environments, viz. a finite sequence
T of relational typings $R \subset A \times B$ of relation variables, depending on A, and
obeying standard conventions for environments. The formation rules for atomic
formulae consist of the usual one for equations, and now also one for relations:

r\-t:A, r\-u:B, T h T, ThRcAxB 

r,T \- R(t, u) Prop (also written tRu) 

The other formation rules for formulae are as one would expect. Relation envi- 
ronments will also be amalgamated into T. Relation definition is accommodated: 

T, x'.A,y:B h (j) Prop 
r h {x:A,y:B).(p G Ax B 

For example $\mathsf{eq}_A \stackrel{\mathrm{def}}{=} (x\!:\!A,\ y\!:\!A).(x =_A y)$.

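If $\rho \subset A \times B$, $\rho' \subset A' \times B'$ and $\rho''[R] \subset A[Y] \times B[Z]$, then complex relations are
built by $\rho \rightarrow \rho' \subset (A \rightarrow A') \times (B \rightarrow B')$ where

\[ (\rho \rightarrow \rho') \stackrel{\mathrm{def}}{=} (f\!:\!A \rightarrow A',\ g\!:\!B \rightarrow B').(\forall x\!:\!A\,\forall x'\!:\!B.(x\,\rho\,x' \Rightarrow (f x)\,\rho'\,(g x'))) \]

and $\forall(Y,Z,R \subset Y \times Z)\rho''[R] \subset (\forall Y.A[Y]) \times (\forall Z.B[Z])$ where

\[ \forall(Y,Z,R \subset Y \times Z)\rho'' \stackrel{\mathrm{def}}{=} (y\!:\!\forall Y.A[Y],\ z\!:\!\forall Z.B[Z]).(\forall Y\,\forall Z\,\forall R \subset Y \times Z.((yY)\,\rho''[R]\,(zZ))) \]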

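One can now acquire further definable relations by substituting definable relations
for type variables in types. For $X = X_1,\ldots,X_n$, $B = B_1,\ldots,B_n$,
$C = C_1,\ldots,C_n$ and $\rho = \rho_1,\ldots,\rho_n$, where $\rho_i \subset B_i \times C_i$, we get $T[\rho] \subset T[B] \times T[C]$,
the action of $T[X]$ on $\rho$, defined by cases on $T[X]$ as follows:

\[ \begin{array}{ll}
T[X] = X_i : & T[\rho] = \rho_i \\
T[X] = T'[X] \rightarrow T''[X] : & T[\rho] = T'[\rho] \rightarrow T''[\rho] \\
T[X] = \forall X'.T'[X,X'] : & T[\rho] = \forall(Y,Z,R \subset Y \times Z).T'[\rho,R]
\end{array} \]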

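The proof system giving the consequence relation of the logic is natural deduction
over formulae now involving relation symbols, and is hence augmented
with inference rules for relation symbols. For example, we have for $\Phi$ a finite set
of formulae:

\[ \frac{\Phi \vdash_{\Gamma,\,R \subset A \times B} \phi[R]}{\Phi \vdash_{\Gamma} \forall R \subset A \times B\,.\,\phi[R]} \qquad\qquad \frac{\Phi \vdash_{\Gamma} \forall R \subset A \times B\,.\,\phi[R], \quad \Gamma \vdash \rho \subset A \times B}{\Phi \vdash_{\Gamma} \phi[\rho]} \]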

We will usually conveniently omit the sequent symbol henceforth. One also
has axioms for equational reasoning and $\beta\eta$ equalities. Finally, the following
parametricity axiom schema is asserted:

\[ \mathit{Param}:\ \forall Y_1,\ldots,\forall Y_n\,\forall u\!:\!(\forall X.T[X,Y_1,\ldots,Y_n])\,.\ u\,(\forall X.T[X,\mathsf{eq}_{Y_1},\ldots,\mathsf{eq}_{Y_n}])\,u \]

To understand, it helps to ignore the parameters $Y_i$ and expand the definition
to get $\forall u\!:\!(\forall X.T[X])\,.\,\forall Y\,\forall Z\,\forall R \subset Y \times Z\,.\ u(Y)\ T[R]\ u(Z)$, i.e. if one instantiates
a polymorphic inhabitant at two related types then the results are also related.
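The reading of Param just given can be illustrated on lists. The sketch below is only an analogy under stated assumptions: Python does not enforce parametricity, so `rearrange` merely stands for a term of type $\forall X.\mathsf{list}(X) \rightarrow \mathsf{list}(X)$ that happens to be written uniformly, and the relation $R$ is taken to be the graph of a hypothetical function `h`. The names `rearrange`, `related` and `h` are not from the paper.

```python
def rearrange(xs):
    """Stands for a term of type forall X. list(X) -> list(X).
    It never inspects the elements, only their positions."""
    return list(reversed(xs)) + xs[:1]

def related(h, xs, ys):
    """The list relation induced by R = graph of h:
    equal length and pointwise h(x) == y."""
    return len(xs) == len(ys) and all(h(x) == y for x, y in zip(xs, ys))

# A hypothetical relation between int and str, given as the graph of h.
h = lambda n: str(n * n)

xs = [1, 2, 3]
ys = [h(x) for x in xs]   # inputs related by list(R)

out_left = rearrange(xs)
out_right = rearrange(ys)
```

Related inputs yield related outputs: `related(h, out_left, out_right)` holds here, which is the instance of "instantiate at two related types, obtain related results" for this particular function.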
This logic is sound w.r.t. the parametric PER-model of [3] and the syntactic
parametric models of m. Crucially, we have the following link to equality:
Fact 1 (Identity Extension Lemma [31]). For any $T[Z]$, the following sequent
is derivable using Param.

\[ \forall Z.\forall u,v\!:\!T\,.\ (u\ T[\mathsf{eq}_Z]\ v \ \Leftrightarrow\ (u =_T v)) \]

Encapsulation is provided by the following encoding of existential types and 
the following pack and unpack combinators. 

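\[ \begin{array}{l}
\exists X.T[X] \stackrel{\mathrm{def}}{=} \forall Y.(\forall X.(T[X] \rightarrow Y) \rightarrow Y) \\[2pt]
\mathsf{pack}_{T[X]} : \forall X.(T[X] \rightarrow \exists X.T[X]) \\
\mathsf{pack}_{T[X]}(A)(\mathit{impl}) \stackrel{\mathrm{def}}{=} \Lambda Y.\lambda f\!:\!\forall X.(T[X] \rightarrow Y).\,f(A)(\mathit{impl}) \\[2pt]
\mathsf{unpack}_{T[X]} : (\exists X.T[X]) \rightarrow \forall Y.(\forall X.(T[X] \rightarrow Y) \rightarrow Y) \\
\mathsf{unpack}_{T[X]}(\mathit{package})(B)(\mathit{client}) \stackrel{\mathrm{def}}{=} \mathit{package}(B)(\mathit{client})
\end{array} \]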

We omit subscripts to pack and unpack as much as possible. Operationally, pack 
packages a data representation and an implementation of operators on that data 
representation. The resulting package is a polymorphic functional that given a 
client and its result domain, instantiates the client with the particular elements 
of the package. And unpack is the application operator for pack. 
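The operational reading of pack and unpack can be mimicked in an untyped setting. The following is a sketch under the assumption that type arguments are erased: `rep_type` only documents the representation type $A$, and the counter signature (`zero`, `inc`, `read`) is a hypothetical example, not one from the paper. Python provides no abstraction barrier, so the sketch shows the data flow only.

```python
def pack(rep_type, impl):
    """pack(A)(impl) = /\\Y. \\f. f(A)(impl), with type arguments erased.
    rep_type is kept only as documentation of the hidden representation."""
    return lambda client: client(impl)

def unpack(package, client):
    """unpack(package)(B)(client) = package(B)(client), types erased."""
    return package(client)

# A hypothetical package: counters represented by Python ints.
counter_impl = {
    "zero": 0,
    "inc": lambda n: n + 1,
    "read": lambda n: n,
}
package = pack(int, counter_impl)

# A client only uses the operations it is handed, never the representation.
client = lambda ops: ops["read"](ops["inc"](ops["inc"](ops["zero"])))

result = unpack(package, client)
```

Unpacking simply hands the packaged operations to the client, so `result` equals `client(counter_impl)`.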

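Fact 2 (Characterisation by Simulation Relation [31]). The following sequent
schema is derivable using Param.

\[ \begin{array}{l}
\forall Z.\forall u,v\!:\!\exists X.T[X,Z]\,. \\
\quad u =_{\exists X.T[X,Z]} v \ \Leftrightarrow\ \exists A,B.\,\exists \mathfrak{a}\!:\!T[A,Z],\ \mathfrak{b}\!:\!T[B,Z].\,\exists R \subset A \times B\,. \\
\qquad u = (\mathsf{pack}\,A\,\mathfrak{a}) \ \wedge\ v = (\mathsf{pack}\,B\,\mathfrak{b}) \ \wedge\ \mathfrak{a}\,(T[R,\mathsf{eq}_Z])\,\mathfrak{b}
\end{array} \]

The sequent in Fact 2 states the equivalence of equality at existential type with
the existence of a simulation relation in the sense of [25]. From this we also get

\[ \forall Z.\forall u\!:\!\exists X.T[X,Z].\,\exists A.\,\exists \mathfrak{a}\!:\!T[A,Z]\,.\ u = (\mathsf{pack}\,A\,\mathfrak{a}) \]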

Weak versions of standard constructs such as products, initial and final 
(co-)algebras are encodable in System F j^. With Param, these constructs are 
provably universal constructions. We can e.g. freely use product types. Given 
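$\rho \subset A \times B$ and $\rho' \subset A' \times B'$, $(\rho \times \rho')$ is defined as the action $(X \times X')[\rho,\rho']$. One
derives $\forall u\!:\!A \times A',\ v\!:\!B \times B'\,.\ u\,(\rho \times \rho')\,v \Leftrightarrow (\mathsf{fst}(u)\,\rho\,\mathsf{fst}(v) \wedge \mathsf{snd}(u)\,\rho'\,\mathsf{snd}(v))$. We
use the abbreviations $\mathsf{bool} = \forall X.X \rightarrow X \rightarrow X$, $\mathsf{nat} = \forall X.X \rightarrow (X \rightarrow X) \rightarrow X$,
and $\mathsf{list}(A) = \forall X.X \rightarrow (A \rightarrow X \rightarrow X) \rightarrow X$. These inductive types are provably
initial constructs.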
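The three abbreviations are the usual Church encodings, and their shape can be seen by writing them as curried functions. The sketch below erases the type abstraction over $X$; the names `true`, `false`, `zero`, `succ`, `nil`, `cons` and the decoders `to_bool`, `to_int`, `to_list` are assumptions made for illustration.

```python
# bool = forall X. X -> X -> X : choose between two alternatives
true = lambda t: lambda f: t
false = lambda t: lambda f: f

# nat = forall X. X -> (X -> X) -> X : iterate a step function from a base value
zero = lambda z: lambda s: z
succ = lambda n: lambda z: lambda s: s(n(z)(s))

# list(A) = forall X. X -> (A -> X -> X) -> X : fold over the elements
nil = lambda n: lambda c: n
cons = lambda a, l: lambda n: lambda c: c(a)(l(n)(c))

# Decoders instantiate X at an ordinary Python type.
to_bool = lambda b: b(True)(False)
to_int = lambda n: n(0)(lambda k: k + 1)
to_list = lambda l: l([])(lambda a: lambda rest: [a] + rest)

two = succ(succ(zero))
one_two = cons(1, cons(2, nil))
```

Each encoded value is its own eliminator: a numeral is an iterator and a list is a fold, which is what makes the types initial in the presence of Param.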



3 Data Type Specification and First-Order Results 

Existential types provide a nice way of specifying abstract data types [27]. In 
System F and the accompanying logic of [31], this mode of specification leads to
specification up to observational equivalence, where the latter is defined w.r.t.
some given finite set Obs of closed inductive types for which the Identity Extension
Lemma (Fact 1) gives $x\,(C[\rho])\,y \Leftrightarrow x =_C y$. Examples are bool and nat.
In the following we shall use record type notation as a notational convenience. 

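Definition 1 (Abstract Data Type Specification). An abstract data type
specification SP is a tuple

\[ \langle\langle \mathit{Sig}_{SP}, \Theta_{SP} \rangle, \mathit{Obs} \rangle \]

where $\mathit{Sig}_{SP} = \exists X.T_{SP}[X]$, for $T_{SP}[X] = \mathit{Record}(f_1\!:\!T_1, \ldots, f_k\!:\!T_k)$,
and where $\Theta_{SP}(u) = \exists X.\exists \mathfrak{x}\!:\!T_{SP}[X]\,.\ u = (\mathsf{pack}\,X\,\mathfrak{x}) \wedge \Phi_{SP}[X,\mathfrak{x}]$.

If $\Theta_{SP}(u)$ is derivable, then $u$ is said to be a realisation of SP.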

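Example 1. For example $\mathit{Stack} = \langle\langle \mathit{Sig}_{Stack}, \Theta_{Stack} \rangle, \{\mathsf{nat}\} \rangle$, where

\[ \begin{array}{l}
\mathit{Sig}_{Stack} = \exists X.T_{Stack}[X], \\
T_{Stack}[X] = \mathit{Record}(\mathit{empty}\!:\!X,\ \mathit{push}\!:\!\mathsf{nat} \times X \rightarrow X,\ \mathit{pop}\!:\!X \rightarrow X,\ \mathit{top}\!:\!X \rightarrow \mathsf{nat}), \\
\Theta_{Stack}(u) = \exists X.\exists \mathfrak{x}\!:\!T_{Stack}[X]\,.\ u = (\mathsf{pack}\,X\,\mathfrak{x}) \ \wedge \\
\qquad \forall x\!:\!\mathsf{nat}, s\!:\!X\,.\ \mathfrak{x}.\mathit{pop}(\mathfrak{x}.\mathit{push}(x,s)) = s \ \wedge \\
\qquad \forall x\!:\!\mathsf{nat}, s\!:\!X\,.\ \mathfrak{x}.\mathit{top}(\mathfrak{x}.\mathit{push}(x,s)) = x
\end{array} \]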
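To make the Stack specification tangible, the sketch below gives two candidate implementations of the record $T_{Stack}[X]$ and checks the two axioms on sample data. This is an illustration under assumptions, not a derivation in the logic: the dictionaries `list_stack` and `pair_stack`, the helper `satisfies_axioms` and the client `observe` are hypothetical, and testing on samples is of course weaker than deriving $\Theta_{Stack}$.

```python
# Implementation 1: Python lists, top of the stack at the end.
list_stack = {
    "empty": [],
    "push": lambda x, s: s + [x],
    "pop": lambda s: s[:-1],
    "top": lambda s: s[-1],
}

# Implementation 2: nested pairs, None is the empty stack.
pair_stack = {
    "empty": None,
    "push": lambda x, s: (x, s),
    "pop": lambda s: s[1],
    "top": lambda s: s[0],
}

def satisfies_axioms(impl, samples):
    """Check pop(push(x, s)) = s and top(push(x, s)) = x on stacks
    reachable from empty by pushing the given samples."""
    states = [impl["empty"]]
    for x in samples:
        states.append(impl["push"](x, states[-1]))
    return all(
        impl["pop"](impl["push"](x, s)) == s
        and impl["top"](impl["push"](x, s)) == x
        for x in samples
        for s in states
    )

def observe(impl):
    """A client computing an observable nat using only the signature."""
    s = impl["push"](7, impl["push"](3, impl["empty"]))
    return impl["top"](impl["pop"](s))
```

Both implementations pass the sampled axioms, and the client `observe` returns the same natural number for each, although the two representations differ.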

We reserve $T[X]$ for the function-profile part of abstract data types $\exists X.T[X]$.
For brevity, in this paper we do not consider parameterised specifications and
so assume $X$ to be the only free type variable in $T[X]$.

The notion of specification of Def. 1 resembles that of [22]. However, as we
are about to see, the important difference is that here equality of data-type
inhabitants is inherently behavioural, and implementation is up to observational
equivalence. In analogy to the meta-level notion in [25], we define observational
equivalence in terms of observable computations in the logic as follows.

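Definition 2 (Observational Equivalence (ObsEq)). Define observational
equivalence ObsEq w.r.t. Obs in the logic by

\[ \begin{array}{l}
\mathsf{ObsEq} \stackrel{\mathrm{def}}{=} (u\!:\!\exists X.T[X],\ v\!:\!\exists X.T[X]). \\
\quad (\exists A,B.\,\exists \mathfrak{a}\!:\!T[A],\ \mathfrak{b}\!:\!T[B]\,.\ u = (\mathsf{pack}\,A\,\mathfrak{a}) \ \wedge\ v = (\mathsf{pack}\,B\,\mathfrak{b}) \ \wedge \\
\qquad \bigwedge_{C \in \mathit{Obs}} \forall f\!:\!\forall X.(T[X] \rightarrow C)\,.\ (f A\,\mathfrak{a}) = (f B\,\mathfrak{b}))
\end{array} \]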

The first result is essential to understanding the notion of specification in Def. 1.
It is a syntactic counterpart to a semantic result in ISM- 

Theorem 3 ([12]). Suppose $\langle\langle \exists X.T[X], \Theta \rangle, \mathit{Obs} \rangle$ is an abstract data type specification
such that $T[X]$ only contains first-order function profiles. Then, assuming
Param, equality at existential type is derivably equivalent to observational
equivalence, i.e. the following is derivable in the logic.

\[ \forall u,v\!:\!\exists X.T[X]\,.\ u =_{\exists X.T[X]} v \ \Leftrightarrow\ u\ \mathsf{ObsEq}\ v \]

Proof: This follows from Fact 2 and Theorem 4 below. □

Theorem 4 ([12]). Let $\exists X.T[X]$ be as in Theorem 3. Then, assuming Param,
the existence of a simulation relation is derivably equivalent to observational
equivalence, i.e. the following is derivable.

\[ \begin{array}{l}
\forall A,B.\,\forall \mathfrak{a}\!:\!T[A],\ \mathfrak{b}\!:\!T[B]\,. \\
\quad \exists R \subset A \times B\,.\ \mathfrak{a}\,(T[R])\,\mathfrak{b} \ \Leftrightarrow\ \bigwedge_{C \in \mathit{Obs}} \forall f\!:\!\forall X.(T[X] \rightarrow C)\,.\ (f A\,\mathfrak{a}) = (f B\,\mathfrak{b})
\end{array} \]

Proof: $\Rightarrow$: This follows from Param.

$\Leftarrow$: We must exhibit an $R$ such that $\mathfrak{a}\,(T[R])\,\mathfrak{b}$. Semantically, [25,26,39] relate
elements iff they are denotable by some common term. We mimic this: For $R$
give $\mathsf{Dfnbl} \stackrel{\mathrm{def}}{=} (a\!:\!A,\ b\!:\!B).(\exists f\!:\!\forall X.(T[X] \rightarrow X).\ (f A\,\mathfrak{a}) = a \wedge (f B\,\mathfrak{b}) = b)$. □

Given Theorem 3, $\Theta_{SP}(u)$ of Def. 1 expresses "$u$ is observationally equivalent to
a package $(\mathsf{pack}\,X\,\mathfrak{x})$ that satisfies the axioms $\Phi_{SP}$". Hence specification according
to Def. 1 is up to observational equivalence.

Notice that there is nothing hindering having free variables in an observable
computation $f\!:\!\forall X.(T[X] \rightarrow C)$. Importantly, though, these free variables cannot
be of the existentially bound type.

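Example 2 ([IS]). Consider specification $\mathit{Set} = \langle\langle \mathit{Sig}_{Set}, \Theta_{Set} \rangle, \{\mathsf{bool}, \mathsf{nat}\} \rangle$, for

\[ \begin{array}{l}
\mathit{Sig}_{Set} = \exists X.T_{Set}[X], \\
T_{Set}[X] = \mathit{Record}(\mathit{empty}\!:\!X,\ \mathit{add}\!:\!\mathsf{nat} \times X \rightarrow X,\ \mathit{remove}\!:\!\mathsf{nat} \times X \rightarrow X,\ \mathit{in}\!:\!\mathsf{nat} \times X \rightarrow \mathsf{bool}), \\
\Theta_{Set}(u) = \exists X.\exists \mathfrak{x}\!:\!T_{Set}[X]\,.\ u = (\mathsf{pack}\,X\,\mathfrak{x}) \ \wedge \\
\qquad \forall x\!:\!\mathsf{nat}, s\!:\!X\,.\ \mathfrak{x}.\mathit{add}(x, \mathfrak{x}.\mathit{add}(x,s)) = \mathfrak{x}.\mathit{add}(x,s) \ \wedge \\
\qquad \forall x,y\!:\!\mathsf{nat}, s\!:\!X\,.\ \mathfrak{x}.\mathit{add}(x, \mathfrak{x}.\mathit{add}(y,s)) = \mathfrak{x}.\mathit{add}(y, \mathfrak{x}.\mathit{add}(x,s)) \ \wedge \\
\qquad \forall x\!:\!\mathsf{nat}\,.\ \mathfrak{x}.\mathit{in}(x, \mathfrak{x}.\mathit{empty}) = \mathsf{false} \ \wedge \\
\qquad \forall x,y\!:\!\mathsf{nat}, s\!:\!X\,.\ \mathfrak{x}.\mathit{in}(x, \mathfrak{x}.\mathit{add}(y,s)) = \text{if } x =_{\mathsf{nat}} y \text{ then } \mathsf{true} \text{ else } \mathfrak{x}.\mathit{in}(x,s) \ \wedge \\
\qquad \forall x\!:\!\mathsf{nat}, s\!:\!X\,.\ \mathfrak{x}.\mathit{in}(x, \mathfrak{x}.\mathit{remove}(x,s)) = \mathsf{false}
\end{array} \]

Consider the data type $\mathit{LI} = (\mathsf{pack}\ \mathsf{list}(\mathsf{nat})\ \mathfrak{l})\!:\!\mathit{Sig}_{Set}$ where $\mathfrak{l}.\mathit{empty}$ gives the
empty list, $\mathfrak{l}.\mathit{add}$ adds a given element to the end of a list only if the element does
not occur in the list, $\mathfrak{l}.\mathit{in}$ is the occurrence function, and $\mathfrak{l}.\mathit{remove}$ removes the first
occurrence of a given element. Typing allows users of LI to only build lists using
$\mathfrak{l}.\mathit{empty}$ and $\mathfrak{l}.\mathit{add}$, and on such lists the efficient $\mathfrak{l}.\mathit{remove}$ gives the intended result.
Crucially, any closed observation $f\!:\!\forall X.(T_{Set}[X] \rightarrow C)$, $C \in \mathit{Obs}$ can only refer
to lists built using $\mathfrak{l}.\mathit{empty}$ and $\mathfrak{l}.\mathit{add}$. For example, in the observable computation
$\Lambda X.\lambda \mathfrak{x}\!:\!T_{Set}[X]\,.\ \mathfrak{x}.\mathit{in}(x, \mathfrak{x}.\mathit{remove}(x, g))$, where $g$ is a term of the bound type $X$,
the typing rules insist that $g$ can only be of the form $\mathfrak{x}.\mathit{add}(\cdots \mathfrak{x}.\mathit{add}(\mathfrak{x}.\mathit{empty}) \cdots)$
and not a free variable. This implies through Theorem 3 that LI is a realisation
of Set according to Def. 1.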

In the world of algebraic specification, there is no formal restriction on the 
set In of so-called input-sorts. Thus, if one chooses the set of input sorts to 
be In = {set, bool, nat}, then in(x, remove(x, s)) where s is a variable, is an 
observable computation. This computation might give true, since s ranges over 
all lists. In algebraic specification one has to explicitly restrict input sorts to not 
include the abstract sort, in this case set, when defining observational equivalence 
|T?nj . whereas the type-theoretic formalism deals with this automatically. O 
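The contrast drawn in Example 2 can be replayed concretely. The sketch below implements the list-based data type as described in the example (add at the end only if absent, remove the first occurrence); the dictionary `li`, the helper `build` and the sample lists are assumptions for illustration. Python cannot enforce that clients only see lists built through `empty` and `add`, so the "unreachable" list is constructed by hand to play the role of a variable of the abstract sort.

```python
def _add(x, s):
    """Append x only if it does not already occur in s."""
    return s if x in s else s + [x]

def _remove(x, s):
    """Remove the first occurrence of x, if any."""
    if x not in s:
        return s
    i = s.index(x)
    return s[:i] + s[i + 1:]

li = {
    "empty": [],
    "add": _add,
    "remove": _remove,
    "in": lambda x, s: x in s,
}

def build(elems):
    """Build a set value using only empty and add, as a typed client must."""
    s = li["empty"]
    for x in elems:
        s = li["add"](x, s)
    return s

reachable = build([1, 2, 1, 3])   # duplicate-free by construction
unreachable = [1, 1]              # a list no client can build via empty/add

ok_on_reachable = li["in"](1, li["remove"](1, reachable))
broken_on_unreachable = li["in"](1, li["remove"](1, unreachable))
```

On values reachable through the signature the last axiom of Set holds, whereas on the hand-made list with a duplicate the observation `in(x, remove(x, s))` yields true. This is the situation that input sorts must rule out explicitly in algebraic specification and that typing rules out in the type-theoretic setting.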



136 Jo Erskine Hannay 



The idea of specification refinement up to observational equivalence can now
be expressed straightforwardly by simply using the notion of refinement in [22].

Definition 3 (Type Theory Specification Refinement). A specification
SP' is a refinement of specification SP, via constructor $F\!:\!\mathit{Sig}_{SP'} \rightarrow \mathit{Sig}_{SP}$ if

\[ \forall u\!:\!\mathit{Sig}_{SP'}\,.\ \Theta_{SP'}(u) \Rightarrow \Theta_{SP}(F u) \]

is derivable. We write $SP \rightsquigarrow_F SP'$ for this fact.

The notion of constructor $F\!:\!\mathit{Sig}_{SP'} \rightarrow \mathit{Sig}_{SP}$ in Def. 3 is based on the notion
of parameterised program m- Given a program P that is a realisation of SP', 
the instantiation F{P) is then a realisation of SP. Constructors correspond to 
refinement maps in m. It is evident that the refinement relation of Def. 3 is in
a sense transitive, i.e. we have vertical composability m:

\[ SP \rightsquigarrow_F SP' \ \text{ and } \ SP' \rightsquigarrow_{F'} SP'' \ \Rightarrow\ SP \rightsquigarrow_{F \circ F'} SP'' \]

where $F \circ F' = \lambda u\!:\!\mathit{Sig}_{SP''}.F(F' u)$. In terms of algebraic specification, any constructor
$F\!:\!\mathit{Sig}_{SP'} \rightarrow \mathit{Sig}_{SP}$ is by Theorem 3 inherently stable under parametricity:
Congruence gives $\forall u,v\!:\!\mathit{Sig}_{SP'}\,.\ u =_{\mathit{Sig}_{SP'}} v \Rightarrow F u =_{\mathit{Sig}_{SP}} F v$. And
equality at existential type is of course observational equivalence.

Relating data types by simulation relations is often called data refinement.
There are thus two refinement dimensions; one concerning specifications, and
within each stage of this refinement process, a second dimension concerning observational
equivalence, i.e. simulation relations, i.e. data refinement. At first
order, Theorems 3 and 4 give the essential property that the existence of simulation
relations is transitive, but we can actually give a more constructive result:

Theorem 5 (Composability of Simulation Relations). Suppose $T[X]$ only
contains first-order function profiles. Then we can derive

\[ \begin{array}{l}
\forall A,B,G,\ R \subset A \times B,\ S \subset B \times G,\ \mathfrak{a}\!:\!T[A],\ \mathfrak{b}\!:\!T[B],\ \mathfrak{g}\!:\!T[G]\,. \\
\quad \mathfrak{a}\,(T[R])\,\mathfrak{b} \ \wedge\ \mathfrak{b}\,(T[S])\,\mathfrak{g} \ \Rightarrow\ \mathfrak{a}\,(T[S \circ R])\,\mathfrak{g}
\end{array} \]



4 Higher Order 

If $T[X]$ has higher-order function profiles, Theorem 4 fails due to Dfnbl not
extending to a logical relation. Theorem 5 fails as well, and indeed we cannot
even derive that the existence of simulation relations is transitive.

The solution we present here is based on an alternative notion of simulation
relation, and is motivated as follows. Consider the higher-order signature
$\exists X.\mathit{Record}(f\!:\!(X \rightarrow X) \rightarrow \mathsf{nat},\ g\!:\!X \rightarrow X)$. One requirement for an
$R \subset A \times B$ to be respected in the standard sense by two implementations $\mathfrak{a}$
and $\mathfrak{b}$, is that $\forall \delta\!:\!A \rightarrow A, \forall \gamma\!:\!B \rightarrow B\,.\ \delta\,(R \rightarrow R)\,\gamma \Rightarrow \mathfrak{a}.f(\delta) =_{\mathsf{nat}} \mathfrak{b}.f(\gamma)$.
But since $f$ is defined within a package, $f$ should be specific to that package,
and $f$'s behaviour on elements outside the package should be irrelevant.
Therefore the proof obligation should not have to consider the behaviour of
$\mathfrak{a}.f$ and $\mathfrak{b}.f$ on arbitrary operators $\delta\!:\!A \rightarrow A$ and $\gamma\!:\!B \rightarrow B$ as long as their
behaviour satisfies the requirement for operators defined in terms of $\mathfrak{a}.g$ and
$\mathfrak{b}.g$ and operators of globally accessible types. This view is partly what the type
system promotes through existential types: Operationally, the only way two concrete
data types $(\mathsf{pack}\,A\,\mathfrak{a})$ and $(\mathsf{pack}\,B\,\mathfrak{b})$ can be used is in clients of the form
$\Lambda X.\lambda \mathfrak{x}\!:\!\mathit{Record}(f\!:\!(X \rightarrow X) \rightarrow \mathsf{nat},\ g\!:\!X \rightarrow X)\,.\ t$. Such a client can incite the
application of $\mathfrak{a}.f$ and $\mathfrak{b}.f$ to $\mathfrak{a}.g$ and $\mathfrak{b}.g$ resp., but not to arbitrary $\delta\!:\!A \rightarrow A$ and
$\gamma\!:\!B \rightarrow B$. Existential types therefore provide an abstraction barrier to which
the standard definition of type relations is in a certain sense oblivious, and we
suggest altering the relational proof criteria accordingly.
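The motivation can be illustrated with a pair of hypothetical implementations of the signature $\mathit{Record}(f\!:\!(X \rightarrow X) \rightarrow \mathsf{nat},\ g\!:\!X \rightarrow X)$. Everything in the sketch is an assumption made for illustration: both packages represent $X$ by integers, the candidate relation is equality, and `b["f"]` deliberately misbehaves on operators that no client can construct. The sketch also assumes that, up to extensional equality, the operators of type $X \rightarrow X$ available to a System F client are the $k$-fold compositions of $g$; Python itself enforces none of this.

```python
# Two hypothetical implementations with representation type int.
a = {
    "f": lambda d: d(0),
    "g": lambda n: n + 1,
}
b = {
    # Agrees with a["f"] on operators of the form n -> n + k,
    # but returns 99 on any operator that is not a unit-step shift.
    "f": lambda d: d(0) if d(1) - d(0) == 1 else 99,
    "g": lambda n: n + 1,
}

def client(k):
    """Client applying f to the k-fold composition of the package's own g."""
    def run(pkg):
        def op(x):
            for _ in range(k):
                x = pkg["g"](x)
            return x
        return pkg["f"](op)
    return run

# An operator on the representation type that no client can express.
double = lambda n: 2 * n

client_results = [(client(k)(a), client(k)(b)) for k in range(6)]
outside_results = (a["f"](double), b["f"](double))
```

With equality as the relation, `double` is related to itself by $R \rightarrow R$, yet the two `f` components disagree on it, so the standard proof obligation fails for this choice of $R$. The clients, which only ever pass compositions of `g`, obtain identical observations from both packages, which is the behaviour the weakened arrow-type relation is meant to capture.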

As before $T[X]$ denotes the body of an abstract data type $\exists X.T[X]$, now
possibly with higher-order and polymorphic profiles. We shall assume that

adt: $T[X] = \mathit{Record}(f_1\!:\!T_1[X], \ldots, f_k\!:\!T_k[X])$, where each $f_i\!:\!T_i[X]$ is in uncurried
form, i.e. $T_i[X]$ is of the form $T_{i_1}[X] \times \cdots \times T_{n_i}[X] \rightarrow T_{c_i}[X]$, where
$T_{c_i}$ is not an arrow type. If $T_{c_i}[X]$ is a universal type, then $T_{c_i}[X] \in \mathit{Obs}$.



4.1 The Alternative Simulation Relation 

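For brevity we will abuse vector notation. For a $k$-ary vector $Y$, we write e.g.
$\forall Y$ for the string $\forall Y_1.\forall Y_2.\cdots.\forall Y_k$, and similarly for $\Lambda Y$. If $k = 0$ then the above
all denote the empty string. The first $l$ components of $Y$ are denoted by $Y|_l$.

Definition 4 (Data Type Relation). For $T[X]$, for $k$-ary $Y$, $l$-ary, $l \geq k$,
$E$, $F$, $\rho \subset E \times F$, $A$, $B$, $R \subset A \times B$, $\mathfrak{a}\!:\!T[A]$, $\mathfrak{b}\!:\!T[B]$, we define the data type
relation $U[\rho,R]^*$ inductively by

\[ \begin{array}{ll}
U = X : & U[\rho,R]^* \stackrel{\mathrm{def}}{=} R \\
U = Y_i : & U[\rho,R]^* \stackrel{\mathrm{def}}{=} \rho_i \\
U = \forall X'.U'[Y,X',X] : & U[\rho,R]^* \stackrel{\mathrm{def}}{=} \forall(E_{l+1}, F_{l+1}, \rho_{l+1} \subset E_{l+1} \times F_{l+1})(U'[\rho,\rho_{l+1},R]^*) \\
U = U' \rightarrow U'' : & U[\rho,R]^* \stackrel{\mathrm{def}}{=} (g\!:\!U'[E,A] \rightarrow U''[E,A],\ h\!:\!U'[F,B] \rightarrow U''[F,B])\,.\ (\forall x\!:\!U'[E,A], \forall y\!:\!U'[F,B]\,. \\
 & \qquad (x\ U'[\rho,R]^*\ y \ \wedge\ \mathsf{Dfnbl}^*_{U'[Y,X]}(x,y)) \ \Rightarrow\ (g x)\ U''[\rho,R]^*\ (h y))
\end{array} \]

where

\[ \mathsf{Dfnbl}^*_{U'[Y,X]}(x,y) \stackrel{\mathrm{def}}{=} \exists f\!:\!\forall Y.\forall X.(T[X] \rightarrow U'[Y,X])\,.\ (f\,E|_k\,A\,\mathfrak{a}) = x \ \wedge\ (f\,F|_k\,B\,\mathfrak{b}) = y \]

We usually omit the type subscript to the $\mathsf{Dfnbl}^*$ clause.

The essence of Def. 4 is that the arrow type relation is weakened with the $\mathsf{Dfnbl}^*$
clause. This clause is an extension of the relation exhibited for the proof of
Theorem 4. We have conveniently: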
Lemma 6. For $T[X]$ satisfying adt, we can derive

\[ \mathfrak{a}\,(T[R]^*)\,\mathfrak{b} \ \Leftrightarrow\ \bigwedge_{1 \leq i \leq k} \mathfrak{a}.f_i\ (T_i[R]^*)\ \mathfrak{b}.f_i \]

We also want the data type relation of Def. 4 to retain the property of being the
equality over types in Obs. This is not derivable, but since Obs contains only
inductive types, we get a semantic justification for this property.

Lemma 7. With respect to the parametric PER-model of [3] it is sound to assert
the following axiom schema for $C \in \mathit{Obs}$.

\[ \mathit{Ident}:\ \forall x,y\!:\!C\,.\ x =_C y \ \Leftrightarrow\ x\,(C[\rho]^*)\,y \]

Now with the alternative notion of simulation relation obtained from
Def. 4 we obtain variants of Theorem 4 valid also for higher-order function profiles
(Theorems 9 and 15). However, this comes at a price, since we here choose
not to alter the parametricity axiom schema. Consequently, we lose proof power
when considering the alternative simulation relation in universal type relations, 
and we can no longer rely directly on parametricity, as in Lemma [H when de- 
riving observational equivalence from the existence of a simulation relation. 



4.2 Special Parametricity 

Our solution to this is to validate semantically special instances of alternative
parametricity sufficient to reinstate the necessary proof power.

The special instances come in two variants, both based on the notion of
closed observations. In shifting attention from general observable computations
as proclaimed in Def. 2 to a notion of closed observations, we must now specify
the collection In of input types in observations. (Compare this to the discussion
around Example 2.) A sensible choice is to regard all types in Obs as input types,
and henceforth In is assumed to contain this.

In the following we write for instance $(\forall X.T[X]^* \rightarrow U[X]^*)$, meaning the
relation $\forall(A,B,R \subset A \times B)(T[R]^* \rightarrow U[R]^*)$.

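Lemma 8. For $T[X]$ adhering to adt, for $f\!:\!\forall X.(T[X] \rightarrow U[X])$, for any $U[X]$,
and where free term variables of $f$ are of types in In, we can derive

\[ f\ (\forall X.T[X]^* \rightarrow U[X]^*)\ f \]

By Lemma 8 the following axiom schema is sound w.r.t. any model whose interpretations
of all $f\!:\!\forall X.(T[X] \rightarrow U[X])$ are denotable by terms whose only free
variables are of types in In. For $T[X]$ adhering to adt, for any $U[X]$,

\[ \mathit{SpParam}:\ \forall f\!:\!\forall X.(T[X] \rightarrow U[X])\,.\ f\ (\forall X.T[X]^* \rightarrow U[X]^*)\ f \]

And then using SpParam we get a general version of Theorem 4.

Theorem 9. Given SpParam, for $T[X]$ adhering to adt, the existence of a
simulation relation coincides with observational equivalence, i.e. we can derive

\[ \begin{array}{l}
\forall A,B.\,\forall \mathfrak{a}\!:\!T[A],\ \mathfrak{b}\!:\!T[B]\,. \\
\quad \exists R \subset A \times B\,.\ \mathfrak{a}\,(T[R]^*)\,\mathfrak{b} \ \Leftrightarrow\ \bigwedge_{C \in \mathit{Obs}} \forall f\!:\!\forall X.(T[X] \rightarrow C)\,.\ (f A\,\mathfrak{a}) = (f B\,\mathfrak{b})
\end{array} \]

Proof: $\Rightarrow$: This follows from SpParam and Ident.

$\Leftarrow$: We have to show that $\exists R \subset A \times B\,.\ \mathfrak{a}\,(T[R]^*)\,\mathfrak{b}$ is derivable. We exhibit
$R \stackrel{\mathrm{def}}{=} (a\!:\!A,\ b\!:\!B).(\mathsf{Dfnbl}^*(a,b))$. Due to the assumption adt, it suffices
by Lemma 6 to show for every component $g\!:\!U \rightarrow V$ in $T[X]$ the derivability of

\[ \forall x\!:\!U[A], \forall y\!:\!U[B]\,.\ (x\ U[R]^*\ y \ \wedge\ \mathsf{Dfnbl}^*(x,y)) \ \Rightarrow\ (\mathfrak{a}.g\,x)\ V[R]^*\ (\mathfrak{b}.g\,y) \]

where $V[X]$ is either some $C \in \mathit{Obs}$, whence we recall Ident, or the variable $X$.
Now $\mathsf{Dfnbl}^*(x,y)$ gives $\exists f_U\!:\!\forall X.(T[X] \rightarrow U[X])\,.\ (f_U A\,\mathfrak{a}) = x \wedge (f_U B\,\mathfrak{b}) = y$.
Let $f \stackrel{\mathrm{def}}{=} \Lambda X.\lambda \mathfrak{x}\!:\!T[X]\,.\ (\mathfrak{x}.g(f_U X\,\mathfrak{x}))$.

$V[X] = C \in \mathit{Obs}$: We must show that $\mathfrak{a}.g\,x =_C \mathfrak{b}.g\,y$ is derivable. The assumption
gives $(f A\,\mathfrak{a}) =_C (f B\,\mathfrak{b})$ which by $\beta$-reduction gives the desired result.

$V[X] = X$: We must derive $\exists f\!:\!\forall X.(T[X] \rightarrow V[X])\,.\ (f A\,\mathfrak{a}) = (\mathfrak{a}.g\,x) \wedge
(f B\,\mathfrak{b}) = (\mathfrak{b}.g\,y)$. For this we display $f$ above. □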

We also regain not only transitivity of the existence of simulation relations, but
also composability of simulation relations. This relates the syntactic level to
recent and on-going work on the semantic level, namely the pre-logical relations
of [17,18], the lax logical relations of mm, and the L-relations of m.

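Theorem 10 (Composability of Simulation Relations). Given SpParam,
for $T[X]$ adhering to adt, we can derive

\[ \begin{array}{l}
\forall A,B,G,\ R \subset A \times B,\ S \subset B \times G,\ \mathfrak{a}\!:\!T[A],\ \mathfrak{b}\!:\!T[B],\ \mathfrak{g}\!:\!T[G]\,. \\
\quad \mathfrak{a}\,(T[R]^*)\,\mathfrak{b} \ \wedge\ \mathfrak{b}\,(T[S]^*)\,\mathfrak{g} \ \Rightarrow\ \mathfrak{a}\,(T[S \circ R]^*)\,\mathfrak{g}
\end{array} \]

Proof: Assuming $\mathfrak{a}\,(T[R]^*)\,\mathfrak{b} \wedge \mathfrak{b}\,(T[S]^*)\,\mathfrak{g}$, the goal is to derive for every component
$g\!:\!U \rightarrow V$ in $T[X]$

\[ \forall x\!:\!U[A], \forall z\!:\!U[G]\,.\ (x\ U[S \circ R]^*\ z \ \wedge\ \mathsf{Dfnbl}^*(x,z)) \ \Rightarrow\ (\mathfrak{a}.g\,x)\ V[S \circ R]^*\ (\mathfrak{g}.g\,z) \]

By $\mathsf{Dfnbl}^*(x,z)$ we construct $f \stackrel{\mathrm{def}}{=} \Lambda X.\lambda \mathfrak{x}\!:\!T[X]\,.\ (\mathfrak{x}.g(f_U X\,\mathfrak{x}))$.

$V[X] = C \in \mathit{Obs}$: By assumption and Theorem 9, $(f A\,\mathfrak{a}) = (f B\,\mathfrak{b}) = (f G\,\mathfrak{g})$,
and $\mathfrak{a}.g\,x = (f A\,\mathfrak{a})$ and $(f G\,\mathfrak{g}) = \mathfrak{g}.g\,z$.

$V[X] = X$: We must show $\exists b\!:\!B\,.\ (\mathfrak{a}.g\,x)\ R\ b \ \wedge\ b\ S\ (\mathfrak{g}.g\,z)$. Exhibit
$f B\,\mathfrak{b} = (\mathfrak{b}.g(f_U B\,\mathfrak{b}))$ for $b$. To show e.g. $(\mathfrak{a}.g\,x)\ R\ (\mathfrak{b}.g(f_U B\,\mathfrak{b}))$ it suffices by
assumption to show $x\ U[R]^*\ (f_U B\,\mathfrak{b}) \wedge \mathsf{Dfnbl}^*(x, (f_U B\,\mathfrak{b}))$. But $x = (f_U A\,\mathfrak{a})$, so
$\mathsf{Dfnbl}^*(x, (f_U B\,\mathfrak{b}))$ is trivial and $(f_U A\,\mathfrak{a})\ U[R]^*\ (f_U B\,\mathfrak{b})$ follows by SpParam. □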

To our knowledge, it is not known whether or not the parametric PER-model
of [3] satisfies SpParam, even for $U[X] = C$, $C \in \mathit{Obs}$. We can however validate
SpParam in the polymorphic extensionally collapsed syntactic models of [11] or
the parametric term models of m.
4.3 Sticking to the Parametric PER-Model

However, in this paper our preference is to continue to seek validation under the
non-syntactic parametric PER-model of [3]. Semantically, observational equivalence
is usually defined w.r.t. contexts that when filled, are closed terms. Thus a
reasonable alternative definition in the logic of observational equivalence is the
following.

Definition 5 (Closed Context Observational Equivalence (ObsEqC)). Define
closed context observational equivalence ObsEqC w.r.t. Obs in the logic by

\[ \begin{array}{l}
\mathsf{ObsEqC} \stackrel{\mathrm{def}}{=} (u\!:\!\exists X.T[X],\ v\!:\!\exists X.T[X]). \\
\quad (\exists A,B.\,\exists \mathfrak{a}\!:\!T[A],\ \mathfrak{b}\!:\!T[B]\,.\ u = (\mathsf{pack}\,A\,\mathfrak{a}) \ \wedge\ v = (\mathsf{pack}\,B\,\mathfrak{b}) \ \wedge \\
\qquad \bigwedge_{C \in \mathit{Obs}} \forall f\!:\!\forall X.(T[X] \rightarrow C)\,.\ \mathsf{Closed}_{\Gamma^{In}}(f) \Rightarrow (f A\,\mathfrak{a}) = (f B\,\mathfrak{b}))
\end{array} \]

where $\mathsf{Closed}_{\Gamma^{In}}(f)$ is derivable iff $\Gamma^{In} \vdash f$.

The idea is that closedness is qualified by a given context so as to allow for
variables of input types in observable computations. Note that this was automatically
taken care of in the notion of observational computations of Def. 2.
The task is now to determine what the predicate $\mathsf{Closed}_{\Gamma^{In}}(f)$ should be. This
is intractable in the existing logic, but we can easily circumvent this problem
by introducing $\mathsf{Closed}_{\Gamma^{In}}$ as a family of new basic predicates together with a
predefined semantics as follows.

Definition 6. The logical language is extended with families of basic predicates 
Closed^(r) ranging over types T, and Closed^(t,T) ranging over terms t: T, 
both relative to a given environment T. This new syntax is given a predefined 
semantics as follows. For any type T \- T, term T \- t.T, and evaluation 7 S |T], 

|=r .7 Closed^(T) 4^ exists some type T \- A, some 7 € |T] 

s.t. |r h Tj^ = [r h 

|=r .7 Closedp (t,T) 4^ exists some type T \- A, term T \- a: A, some 7 G IT] 
s.t. |r h T|t, = |T h H]y and |T h t:Tj-y = |r h a:A\^ 

Lemma 11. It is easily seen that the following axiom schemata are sound. 

1. h/’ Closed 7 . jj-(Jf) 

2. hr C\osedp{U) A Closed^]!/) ^ QosedplJJ V) 

3. hr Closedp ^([/) ^ Closed 7 (VJf.[/) 

4 .. hr Closed 7 ^.j^(a;, C/) 

5. hr Closed^ E) Closed 7 (Aa;:C/.t, C/ ^ E) 

6. hr Closed^lg, ^ E) A C\osedp{t,U) => C\osedp{gt,V) 

1. hr Closedp j(.(t, C/) Qosedp{AX.t,\/X.U) 

8. hr Closed^]/, VAf.f/[Af]) A ClosedplH) Closedpl/H, 17[H]) 
9. hr Closed/, (T) ^ Closed/, (T), rcT
10. hr Closed/ (t,r) Closed/, (t, T), F C r' 

We will usually omit the type argument in the term family of Closed. Intuitively, 
we should now be able to use Lemma |S]to show the necessary special parametric- 
ity instance. However, to make the induction spiral work, we have to strengthen 
lemmas IH] and El by incorporating Closed into the DfnbI* clause. 

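Definition 7 (Data Type Relation by Closed Observers). Define the data
type relation by closed observers $U[\rho,R]^{*}_{\mathsf{C}}$ as the data type relation $U[\rho,R]^*$ of
Def. 4 but where we use

\[ \begin{array}{l}
\mathsf{DfnblC}_{U'[Y,X]}(x,y) \stackrel{\mathrm{def}}{=} \exists f\!:\!\forall Y.\forall X.(T[X] \rightarrow U'[Y,X])\,. \\
\qquad \mathsf{Closed}_{\Gamma^{In}}(f) \ \wedge\ (f\,E|_k\,A\,\mathfrak{a}) = x \ \wedge\ (f\,F|_k\,B\,\mathfrak{b}) = y
\end{array} \]

in place of $\mathsf{Dfnbl}^*_{U'[Y,X]}$, for $\Gamma^{In} = x_1\!:\!U_1, \ldots, x_m\!:\!U_m$, $U_i \in \mathit{In}$, $1 \leq i \leq m$.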

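Lemma 12. For $T[X]$ satisfying adt, we have the derivability of

\[ \mathfrak{a}\,(T[R]^{*}_{\mathsf{C}})\,\mathfrak{b} \ \Leftrightarrow\ \bigwedge_{1 \leq i \leq k} \mathfrak{a}.f_i\ (T_i[R]^{*}_{\mathsf{C}})\ \mathfrak{b}.f_i \]

Lemma 13. With respect to the parametric PER-model of [3] it is sound to
assert the following axiom schema for $C \in \mathit{Obs}$.

\[ \mathit{IdentC}:\ \forall x,y\!:\!C\,.\ x =_C y \ \Leftrightarrow\ x\,(C[\rho]^{*}_{\mathsf{C}})\,y \]

Lemma 14. For $T[X]$ adhering to adt, for $f\!:\!\forall X.(T[X] \rightarrow U[X])$, for any
$U[X]$, and where free term variables of $f$ are of types in In, we can derive

\[ f\ (\forall X.T[X]^{*}_{\mathsf{C}} \rightarrow U[X]^{*}_{\mathsf{C}})\ f \]

By Lemma 14 it is sound w.r.t. the parametric PER-model to postulate the
following axiom schema. For $T[X]$ adhering to adt, for $\Gamma^{In} = x_1\!:\!U_1, \ldots, x_m\!:\!U_m$,
$U_i \in \mathit{In}$, $1 \leq i \leq m$, for any $U[X]$,

\[ \mathit{CspParam}:\ \forall f\!:\!\forall X.(T[X] \rightarrow U[X])\,.\ \mathsf{Closed}_{\Gamma^{In}}(f) \Rightarrow f\ (\forall X.T[X]^{*}_{\mathsf{C}} \rightarrow U[X]^{*}_{\mathsf{C}})\ f \]

We can now show the higher-order polymorphic generalisation of Theorem 4, now
validated w.r.t. the parametric PER-model:

Theorem 15. Extending the language with the predicates Closed of Def. 6, given
CspParam, for $T[X]$ adhering to adt, for $\Gamma^{In} = x_1\!:\!U_1, \ldots, x_m\!:\!U_m$, $U_i \in \mathit{In}$,
$1 \leq i \leq m$, the following is derivable.

\[ \begin{array}{l}
\forall A,B.\,\forall \mathfrak{a}\!:\!T[A],\ \mathfrak{b}\!:\!T[B]\,. \\
\quad \exists R \subset A \times B\,.\ \mathfrak{a}\,(T[R]^{*}_{\mathsf{C}})\,\mathfrak{b} \ \Leftrightarrow \\
\qquad \bigwedge_{C \in \mathit{Obs}} \forall f\!:\!\forall X.(T[X] \rightarrow C)\,.\ \mathsf{Closed}_{\Gamma^{In}}(f) \Rightarrow (f A\,\mathfrak{a}) = (f B\,\mathfrak{b})
\end{array} \]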
Proof: ⇒: This follows from CspParam and IdentC. 

⇐: Along the lines of the proof of Theorem 9 and using Lemma 11 to obtain 
Closed (/) from Qosedpin(fij), and using Lemma IT^ and IdentC in place of 
Lemma[6] and Ident. □ 

We now get composability validated w.r.t. the parametric PER-model: 

Theorem 16 (Composability of Simulation Relations). Given CspParam, 
for T[A] adhering to adt, we can derive 

VA, S, G, i? C A X S' c S X G, o: T[A] , 6: T[R] , g: T[G] . 

a(T[i?]J)6 A 6(T[S]^)g ^ a(T[SoR]*)g 

Proof: As for Theorem 10 but using CspParam instead of SpParam. □ 

Finally we retrieve the notions of specification refinement. We have estab- 
lished the coincidence of observational equivalence and the existence of a simula- 
tion relation at higher order, but in this paper we do not tie the link to equality 
at existential type. This is of minor importance because we can simply redefine 
our notions in terms of ObsEqC (or ObsEq) instead of equality: The realisation 
predicate of Def. [T] then reads 0sp{u) = 3A.3y:T5p[A] . u ObsEqC (packAj:) A 
^ 5 p[A, j:]. Note that we now have to show the stability of constructors explicitly. 

5 Final Remarks and Discussion 

This paper has addressed specification refinement up to observational equiva- 
lence with System F using Plotkin and Abadi’s logic for parametric polymor- 
phism. At first order, specification refinement up to observational equivalence 
can be defined in the logic using Luo’s formalism, because equality at existential 
type coincides (Theorem [S]) with observational equivalence ObsEq (Def [2]). 

At higher order, i.e. when the data type signature has higher-order function 
types, we ostensibly lose the correspondence in the logic between observational 
equivalence and the existence of a simulation relation. We argued that at higher 
order the usual notion of simulation relation is too strict, since for function 
types it requires that one consider arbitrary arguments, which might be other than 
those actually accessible in computations. 

Thus an alternative simulation relation was proposed based on the Dfnbl* 
clause and data type relation (Def. E}. Then a correspondence in the 
logic between observational equivalence and the existence of this alternative 
simulation relation is re-established in any model in which the axiom schema 
SpParam is valid (TheoremEI). For the parametric PER-model, we also achieve 
the correspondence (Theorem 15) by extending the logical language with basic 
predicates Closed, defining a second alternative simulation relation and 
validating the axiom schema CspParam w.r.t. the parametric PER-model. Finally, 
we achieve a simulation relation in the logic that composes at higher order 
(Theorems 10 and 16). This relates to on-going work on the semantic level. 
The approach taken in this paper is conservative in that, at the outset, we do 
not want to alter either the type theory or the parametricity axiom schema. 
This is motivated by the view that it is the relational proof criteria specifically 
for abstract data types that need amending, not the type theory itself. The 
parametricity axiom is left alone in order to relate to established models for 
relational parametricity. However, there seem to be other interesting approaches 
worth looking into. One alternative would be to alter the type system so as 
to isolate separate types for use in abstract data types, and then extend the 
parametricity axiom schema to deal with these types. A very promising approach 
to finding a non-syntactic model satisfying SpParam seems to be to work along 
the lines of Jung and Tiuryn [19], and define a non-standard Kripke-like model 
to validate the logic. 
Abstract. We propose an extension of the asynchronous π-calculus with 
a notion of random choice. We define an operational semantics which 
distinguishes between probabilistic choice, made internally by the process, 
and nondeterministic choice, made externally by an adversary scheduler. 
This distinction will allow us to reason about the probabilistic correctness 
of algorithms under certain schedulers. We show that in this language 
we can solve the electoral problem, which was proved to be impossible in 
the asynchronous π-calculus. Finally, we show an implementation of the 
probabilistic asynchronous π-calculus in a Java-like language. 



1 Introduction 

The π-calculus m) is a very expressive specification language for concurrent 
programming, but the difficulties in its distributed implementation challenge 
its candidature to be a canonical model of distributed computation. Certain 
mechanisms of the π-calculus, in fact, require solving a problem of distributed 
consensus. 

The asynchronous π-calculus am), on the other hand, is more suitable 
for a distributed implementation, but it is rather weak for solving distributed 
problems (i)- 

In order to increase the expressive power of the asynchronous π-calculus we 
propose a probabilistic extension, $\pi_{pa}$, based on the probabilistic automata of 
Segala and Lynch ([12]). The characteristic of this model is that it distinguishes 
between probabilistic and nondeterministic behavior. The first is associated with 
the random choices of the process, while the second is related to the arbitrary 
decisions of an external scheduler. This separation allows us to reason about ad- 
verse conditions, i.e. schedulers that “try to prevent” the process from achieving 
its goal. Similar models were presented in m and m- 

Next we show an example of a distributed problem that can be solved with 
$\pi_{pa}$, namely the election of a leader in a symmetric network. It was proved in 
jS] that such a problem cannot be solved with the asynchronous π-calculus. We 
propose an algorithm for the solution of this problem, and we show that it is 
correct, i.e. that the leader will eventually be elected, with probability 1, under 
every possible scheduler. Our algorithm is reminiscent of the algorithm used in 
HD! for solving the dining philosophers problem, but in our case we do not need 
the fairness assumption. Also, the fact that we give the solution in a language 

J. Tiuryn (Ed.): FOSSACS 2000, LNCS 1784, pp. 146- fTEUl 2000. 
provided with a rigorous operational semantics allows us to give a more formal 
proof of correctness (the proof is omitted here due to space limitations, but the 
interested reader can find it in E). 

Finally, we define a “toy” distributed implementation of the $\pi_{pa}$-calculus 
in a Java-like language. The purpose of this exercise is to prove that $\pi_{pa}$ is a 
reasonable paradigm for the specification of distributed algorithms, since it can 
be implemented without loss of expressivity. 

The novelty of our proposal, with respect to other probabilistic process 
algebras which have been defined in the literature (see, for instance, [14]), is the 
definition of the parallel operator in a CCS style, as opposed to the SCCS style. 
Namely, parallel processes are not forced to proceed simultaneously. Note also that 
for general probabilistic automata it is not possible to define the parallel operator, 
or at least, there is no natural definition. In $\pi_{pa}$ the parallel operator can 
be defined as a natural extension of the non-probabilistic case, and this can be 
considered, in our opinion, another argument in favor of the suitability of $\pi_{pa}$ 
for distributed implementation. 

2 Preliminaries 

In this section we recall the definition of the asynchronous π-calculus and the 
definition of probabilistic automata. We consider the late semantics of the 
π-calculus, because the probabilistic extension of the late semantics is simpler 
than the eager version. 

2.1 The Asynchronous π-Calculus 

We follow the definition of the asynchronous π-calculus given in |T], except that 
we will use recursion instead of the replication operator, since we find it to 
be more convenient for writing programs. It is well known that recursion and 
replication are equivalent, see for instance j5]. 

Consider a countable set of channel names, x, y, ..., and a countable set of 
process names X, Y, .... The prefixes α, β, ... and the processes P, Q, ... of the 
asynchronous π-calculus are defined by the following grammar: 

Prefixes    $\alpha ::= x(y) \mid \tau$ 

Processes   $P ::= \bar{x}y \mid \sum_i \alpha_i.P_i \mid \nu x\,P \mid P \mid P \mid X \mid \mathrm{rec}_X P$ 

The basic actions are $x(y)$, which represents the input of the (formal) name 
y from channel x, $\bar{x}y$, which represents the output of the name y on channel x, 
and τ, which stands for any silent (non-communication) action. 

The process $\sum_i \alpha_i.P_i$ represents guarded choice on input or silent prefixes, 
and it is usually assumed to be finite. We will use the abbreviations 0 (inaction) 
to represent the empty sum, α.P (prefix) to represent sum on one element only, 
and P + Q for the binary sum. The symbols νx and | are the restriction and the 
parallel operator, respectively. We adopt the convention that the prefix operator 
has priority wrt + and |. The process $\mathrm{rec}_X P$ represents a process X defined 
as X = P, where P may contain occurrences of X (recursive definition). We 
assume that all the occurrences of X in P are prefixed. 

The operators $\nu x$ and $y(x)$ are x-binders, i.e. in the processes $\nu x\,P$ and $y(x).P$ 
the occurrences of x in P are considered bound, with the usual rules of scoping. 
The free names of P, i.e. those names which do not occur in the scope of any 
binder, are denoted by fn(P). The alpha-conversion of bound names is defined 
as usual, and the renaming (or substitution) P[y/x] is defined as the result of 
replacing all free occurrences of x in P by y, possibly applying alpha-conversion 
in order to avoid capture. 
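
The definition of fn(P) can be made concrete with a small syntax tree. The following Java sketch is only an illustration: the class names (Proc, Out, In, Tau, Nu, Par) are hypothetical and do not come from the paper, sums are limited to a single prefix, and recursion is omitted to keep the example short.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical syntax tree for a fragment of the asynchronous pi-calculus.
abstract class Proc {
    abstract Set<String> fn();   // free names of the process
}

// Output of name y on channel x: both names occur free.
final class Out extends Proc {
    final String x, y;
    Out(String x, String y) { this.x = x; this.y = y; }
    Set<String> fn() {
        Set<String> s = new HashSet<>();
        s.add(x);
        s.add(y);
        return s;
    }
}

// Input prefix x(y).P: y is bound in P, the channel x is free.
final class In extends Proc {
    final String x, y;
    final Proc cont;
    In(String x, String y, Proc cont) { this.x = x; this.y = y; this.cont = cont; }
    Set<String> fn() {
        Set<String> s = new HashSet<>(cont.fn());
        s.remove(y);
        s.add(x);
        return s;
    }
}

// Silent prefix tau.P: binds nothing.
final class Tau extends Proc {
    final Proc cont;
    Tau(Proc cont) { this.cont = cont; }
    Set<String> fn() { return cont.fn(); }
}

// Restriction (nu x) P: x is bound in P.
final class Nu extends Proc {
    final String x;
    final Proc body;
    Nu(String x, Proc body) { this.x = x; this.body = body; }
    Set<String> fn() {
        Set<String> s = new HashSet<>(body.fn());
        s.remove(x);
        return s;
    }
}

// Parallel composition P | Q: union of the free names of both sides.
final class Par extends Proc {
    final Proc left, right;
    Par(Proc left, Proc right) { this.left = left; this.right = right; }
    Set<String> fn() {
        Set<String> s = new HashSet<>(left.fn());
        s.addAll(right.fn());
        return s;
    }
}
```

For instance, for the process νx (x̄y | x(z).z̄w) built as `new Nu("x", new Par(new Out("x", "y"), new In("x", "z", new Out("z", "w"))))`, the call `fn()` yields the set containing y and w: x is bound by the restriction and z by the input prefix.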

The operational semantics is specified via a transition system labeled by 
actions $\mu, \mu', \ldots$. These are given by the following grammar: 

Actions   $\mu ::= x(y) \mid \bar{x}y \mid \bar{x}(y) \mid \tau$ 

Essentially, we have all the actions from the syntax, plus the bound output $\bar{x}(y)$. 
This is introduced to model scope extrusion, i.e. the result of sending to another 
process a private (ν-bound) name. The bound names of an action μ, bn(μ), are 
defined as follows: $bn(x(y)) = bn(\bar{x}(y)) = \{y\}$; $bn(\bar{x}y) = bn(\tau) = \emptyset$. 
Furthermore, we will indicate by n(μ) all the names which occur in μ. 

The rules for the late semantics are given in Table 1. The symbol ≡ used in 
Cong stands for structural congruence, a form of equivalence which identifies 
“statically” two processes and which is used to simplify the presentation. We 
assume this congruence to satisfy the following: 

(i) P ≡ Q if Q can be obtained from P by alpha-renaming, notation $P \equiv_\alpha Q$, 

(ii) $P \mid Q \equiv Q \mid P$, 

(iii) $\mathrm{rec}_X P \equiv P[\mathrm{rec}_X P / X]$. 

Note that communication is modeled by handshaking (Rules Com and 
Close). The reason why this calculus is considered a paradigm for asynchronous 
communication is that there is no primitive output prefix, hence no primitive no- 
tion of continuation after the execution of an output action. In other words, the 
process executing an output action will not be able to detect (in principle) when 
the corresponding input action is actually executed. 

2.2 Probabilistic Automata, Adversaries, and Executions 

Probabilistic automata have been proposed in [12]. We simplify here the original 
definition, and tailor it to what we need for defining the probabilistic extension 
of the asynchronous π-calculus. The main difference is that we consider only 
discrete probabilistic spaces, and that the concept of deadlock is simply a node 
with no out-transitions. 

A discrete probabilistic space is a pair (X, pb) where X is a set and pb is a 
function $pb : X \to (0, 1]$ such that $\sum_{x \in X} pb(x) = 1$. Given a set Y, we define 

$Prob(Y) = \{(X, pb) \mid X \subseteq Y \text{ and } (X, pb) \text{ is a discrete probabilistic space}\}$. 
Table 1. The late-instantiation transition system of the asynchronous π-calculus. 



Given a set of states S and a set of actions A, a probabilistic automaton on S 
and A is a triple $(S, T, s_0)$ where $s_0 \in S$ (initial state) and $T \subseteq S \times Prob(A \times S)$. 
We call the elements of T transition groups (in [12] they are called steps). The 
idea behind this model is that the choice between two different groups is made 
nondeterministically and possibly controlled by an external agent, e.g. a sched- 
uler, while the transition within the same group is chosen probabilistically and it 
is controlled internally (e.g. by a probabilistic choice operator). An automaton 
in which at most one transition group is allowed for each state is called fully 
probabilistic. 
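
As a rough illustration of these definitions, the sketch below stores T as a map from states to lists of transition groups and checks the conditions on pb when a group is added. The names Step, ProbAutomaton, addGroup and isFullyProbabilistic are hypothetical, states and actions are plain strings, and the check that a group contains no duplicate (action, state) pair is left out for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// One element (action, target) of a transition group, with its probability pb.
final class Step {
    final String action, target;
    final double prob;
    Step(String action, String target, double prob) {
        this.action = action;
        this.target = target;
        this.prob = prob;
    }
}

// A probabilistic automaton (S, T, s0) with T stored per source state.
final class ProbAutomaton {
    final String initial;
    final Map<String, List<List<Step>>> groups = new HashMap<>();

    ProbAutomaton(String initial) { this.initial = initial; }

    // Adds one transition group; enforces pb(x) in (0,1] and a total mass of 1.
    void addGroup(String state, Step... steps) {
        double sum = 0.0;
        for (Step s : steps) {
            if (s.prob <= 0.0 || s.prob > 1.0) {
                throw new IllegalArgumentException("probability must lie in (0,1]");
            }
            sum += s.prob;
        }
        if (Math.abs(sum - 1.0) > 1e-9) {
            throw new IllegalArgumentException("probabilities must sum to 1");
        }
        groups.computeIfAbsent(state, k -> new ArrayList<>()).add(Arrays.asList(steps));
    }

    // Fully probabilistic: at most one transition group for each state.
    boolean isFullyProbabilistic() {
        for (List<List<Step>> g : groups.values()) {
            if (g.size() > 1) return false;
        }
        return true;
    }
}
```

A state with two groups models a nondeterministic choice left to the scheduler, whereas the steps inside one group are resolved by the automaton's own random choice.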

We now define the notion of execution of an automaton under a scheduler, 
by adapting and simplifying the corresponding notion given in m- A scheduler 
can be seen as a function which solves the nondeterminism of the automaton by 
selecting, at each moment of the computation, a transition group among all the 
ones allowed in the present state. Schedulers are sometimes called adversaries, 
thus conveying the idea of an external entity playing “against” the process. 
A process is robust wrt a certain class of adversaries if it gives the intended 
result for each possible scheduling imposed by an adversary in the class. Clearly, 
the reliability of an algorithm depends on how “smart” the adversaries of this 
class can be. We will assume that an adversary can decide the next transition 
group depending not only on the current state, but also on the whole history of 
the computation till that moment, including the random choices made by the 
automaton. 

Given a probabilistic automaton $M = (S, T, s_0)$, define tree(M) as the tree 
obtained by unfolding the transition system, i.e. the tree with a root $n_0$ labeled 
by $s_0$, and such that, for each node n, if $s \in S$ is the label of n, then for each 
$(s, (X, pb)) \in T$, and for each $(\mu, s') \in X$, there is a node n' child of n labeled 
by s', and the arc from n to n' is labeled by $\mu$ and $pb(\mu, s')$. We will denote by 
nodes(M) the set of nodes in tree(M), and by state(n) the state labeling a node 
n. 

An adversary for M is a function $\zeta$ that associates to each node n of tree(M) 
a transition group among those which are allowed in state(n). More formally, 
$\zeta : nodes(M) \to Prob(A \times S)$ such that $\zeta(n) = (X, pb)$ implies $(state(n), (X, pb)) \in T$. 

The execution tree of an automaton $M = (S, T, s_0)$ under an adversary 
$\zeta$, denoted by $etree(M, \zeta)$, is the tree obtained from tree(M) by pruning all 
the arcs corresponding to transitions which are not in the group selected by 
$\zeta$. More formally, $etree(M, \zeta)$ is a fully probabilistic automaton $(S', T', n_0)$, 
where $S' \subseteq nodes(M)$, $n_0$ is the root of tree(M), and $(n, (X', pb')) \in T'$ iff 
$X' = \{(\mu, n') \mid (\mu, state(n')) \in X\}$ and $pb'(\mu, n') = pb(\mu, state(n'))$, where 
$(X, pb) = \zeta(n)$. 

An execution fragment $\xi$ is any path (finite or infinite) from the root of 
$etree(M, \zeta)$. The notation $\xi \le \xi'$ means that $\xi$ is a prefix of $\xi'$. If $\xi$ is 
$n_0 \xrightarrow[p_0]{\mu_0} n_1 \xrightarrow[p_1]{\mu_1} n_2 \xrightarrow[p_2]{\mu_2} \ldots$, the probability of $\xi$ is defined as $pb(\xi) = \prod_i p_i$. If $\xi$ is maximal, 
then it is called execution. We denote by $exec(M, \zeta)$ the set of all executions in 
$etree(M, \zeta)$. 
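
The interplay between the adversary and the probabilities can be sketched in a few lines of Java. This is an illustrative simplification, not the paper's construction: the names Arc, ExecutionDemo, probability and example are hypothetical, a node of tree(M) is represented by the list of states visited so far, the adversary returns the index of the group it selects, and a fragment is identified by its states only (arcs of the selected group that lead to the same state are summed).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// One labelled arc of a transition group (hypothetical helper type).
final class Arc {
    final String action, target;
    final double prob;
    Arc(String action, String target, double prob) {
        this.action = action;
        this.target = target;
        this.prob = prob;
    }
}

final class ExecutionDemo {
    // Probability of the finite fragment visiting the states in `path`, when
    // `adversary` maps the history so far to the index of the selected group.
    static double probability(Map<String, List<List<Arc>>> groups,
                              Function<List<String>, Integer> adversary,
                              List<String> path) {
        double p = 1.0;
        for (int k = 0; k + 1 < path.size(); k++) {
            List<List<Arc>> available = groups.get(path.get(k));
            if (available == null) return 0.0;   // deadlock: nothing to select
            List<Arc> chosen = available.get(adversary.apply(path.subList(0, k + 1)));
            double step = 0.0;
            for (Arc a : chosen) {
                if (a.target.equals(path.get(k + 1))) step += a.prob;
            }
            p *= step;                           // product of the arc probabilities
        }
        return p;
    }

    // Toy automaton: state "s" offers two groups; "u" and "v" are deadlocked.
    static Map<String, List<List<Arc>>> example() {
        Map<String, List<List<Arc>>> g = new HashMap<>();
        g.put("s", Arrays.asList(
                Arrays.asList(new Arc("a", "u", 0.25), new Arc("b", "v", 0.75)),
                Arrays.asList(new Arc("c", "v", 1.0))));
        return g;
    }
}
```

In the toy automaton, an adversary that always selects the first group gives the fragment from s to u probability 0.25, while an adversary that selects the second group prunes that arc and gives it probability 0. The same fragment therefore has different probabilities under different schedulers, which is why correctness statements are quantified over adversaries.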

We now define a probability on certain sets of executions, following a standard 
construction of Measure Theory. Given an execution fragment $\xi$, let 
$C_\xi = \{\xi' \in exec(M, \zeta) \mid \xi \le \xi'\}$ (cone with prefix $\xi$). Define $pb(C_\xi) = pb(\xi)$. Let $\{C_i\}_{i \in I}$ 
be a countable set of disjoint cones (i.e. I is countable, and $\forall i, j.\ i \ne j \Rightarrow C_i \cap C_j = \emptyset$). 
Then define $pb(\bigcup_{i \in I} C_i) = \sum_{i \in I} pb(C_i)$. It is possible to show that 
pb is well defined, i.e. two countable sets of disjoint cones with the same union 
produce the same result for pb. We can also define the probability of an empty 
set of executions as 0, and the probability of the complement of a certain set of 
executions as the complement wrt 1 of the probability of the set. The closure 
of the cones wrt the empty set, the countable union, and the complementation 
generates what in Measure Theory is known as a σ-field. 
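
A small worked example may help to see how cones yield statements of the form "with probability 1". The automaton used here is our own illustration, not one from the paper: it has an initial state s and a deadlocked state d, and a single transition group from s that performs τ and returns to s with probability 1/2, or performs τ and moves to d with probability 1/2. Being fully probabilistic, it has only one adversary. Write $\xi_k$ for the execution that loops k-1 times on s and then reaches d, and $\xi_\infty$ for the infinite loop.

```latex
\begin{align*}
pb(C_{\xi_k}) &= pb(\xi_k) = \Big(\tfrac{1}{2}\Big)^{k-1} \cdot \tfrac{1}{2} = 2^{-k}, \\
pb\Big(\bigcup_{k \ge 1} C_{\xi_k}\Big) &= \sum_{k \ge 1} 2^{-k} = 1, \\
pb(\{\xi_\infty\}) &= 1 - pb\Big(\bigcup_{k \ge 1} C_{\xi_k}\Big) = 0 .
\end{align*}
```

Each $\xi_k$ is maximal, so its cone is the singleton $\{\xi_k\}$ and the cones are pairwise disjoint. The set of executions that reach d is their countable union and has probability 1, while the only execution that never reaches d is the complement of that union and has probability 0.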



3 The Probabilistic Asynchronous π-Calculus 

In this section we introduce the probabilistic asynchronous π-calculus ($\pi_{pa}$-calculus 
for short) and we give its operational semantics in terms of probabilistic 
automata. 

The $\pi_{pa}$-calculus is obtained from the asynchronous π-calculus by replacing 
$\sum_i \alpha_i.P_i$ with the following probabilistic choice operator 

$\sum_i p_i\,\alpha_i.P_i$ 

where the $p_i$'s represent positive probabilities, i.e. they satisfy $p_i \in (0, 1]$ and 
$\sum_i p_i = 1$, and the $\alpha_i$'s are input or silent prefixes. 
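
One simple way to realise such a choice operationally is to draw a number u in (0, 1] and select the branch whose cumulative interval contains it. The sketch below shows only that selection step; the class name ChoiceSampler and the method pick are hypothetical, and the use of `1.0 - nextDouble()` to obtain a value in (0, 1] is an assumption of this sketch.

```java
import java.util.Random;

final class ChoiceSampler {
    // Returns the index i such that p[0]+...+p[i-1] < u <= p[0]+...+p[i],
    // for u in (0,1] and probabilities p[i] in (0,1] that sum to 1.
    static int pick(double[] p, double u) {
        double acc = 0.0;
        for (int i = 0; i < p.length; i++) {
            acc += p[i];
            if (u <= acc) return i;
        }
        return p.length - 1;   // only reached if rounding leaves the sum just below u
    }

    // Draws u uniformly from (0,1]; nextDouble() itself returns a value in [0,1).
    static int pick(double[] p, Random gen) {
        return pick(p, 1.0 - gen.nextDouble());
    }
}
```

With p = (0.25, 0.25, 0.5), a draw of u = 0.3 selects the second branch (index 1) and a draw of u = 1.0 selects the third (index 2). Because each $p_i$ is strictly positive and u is never 0, every branch has a non-empty interval.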
In order to give the formal definition of the probabilistic model for $\pi_{pa}$, we 
find it convenient to introduce the following notation for representing transition 
groups: given a probabilistic automaton $(S, T, s_0)$ and $s \in S$, we write 

$s \{\xrightarrow[p_i]{\mu_i} s_i \mid i \in I\}$ 

iff $(s, (\{(\mu_i, s_i) \mid i \in I\}, pb)) \in T$ and $\forall i \in I.\ p_i = pb(\mu_i, s_i)$, where I is an index 
set. When I is not relevant, we will use the simpler notation $s \{\xrightarrow[p_i]{\mu_i} s_i\}_i$. We will 
also use the notation $s \{\xrightarrow[p_i]{\mu_i} s_i\}_{i : \phi(i)}$, where $\phi(i)$ is a logical formula depending 
on i, for the set $s \{\xrightarrow[p_i]{\mu_i} s_i \mid i \in I \text{ and } \phi(i)\}$. 

The operational semantics of a $\pi_{pa}$ process P is defined as a probabilistic 
automaton whose states are the processes reachable from P and the T relation 
is defined by the rules in Table 2. In order to keep the presentation simple, 
we impose the following restrictions: In Sum we assume that all branches are 
different, namely, if $i \ne j$, then either $\alpha_i \ne \alpha_j$, or $P_i \not\equiv P_j$. Furthermore, in Res 
and Par we assume that all bound variables are distinct from each other, and 
from the free variables. 

The Sum rule models the behavior of a choice process. Note that all possi- 
ble transitions belong to the same group, meaning that the transition is chosen 
probabilistically by the process itself. Res models restriction on channel y: only 
the actions on channels different from y can be performed and possibly syn- 
chronize with an external process. The probability is redistributed among these 
actions. Par represents the interleaving of parallel processes. All the transitions 
of the processes involved are made possible, and they are kept separated in the 
original groups. In this way we model the fact that the selection of the process 
for the next computation step is determined by a scheduler. In fact, choosing a 
group corresponds to choosing a process. Com models communication by 
handshaking. The output action synchronizes with all matching input actions of a 
partner, with the same probability as the input action. The other possible 
transitions of the partner are kept with the original probability as well. Close is 
analogous to Com, the only difference being that the name being transmitted is 
private to the sender. Open works in combination with Close as in the 
standard (asynchronous) π-calculus. The other rules, Out and Cong, should be 
self-explanatory. 

The next example shows that the expansion law does not hold in $\pi_{pa}$. This should 
be no surprise, since the choices associated to the parallel operator and to the 
sum, in $\pi_{pa}$, have a different nature: the parallel operator gives rise to 
nondeterministic choices of the scheduler, while the sum gives rise to probabilistic 
choices of the process. 

Example 1. Let $R_1 = x(z).P \mid y(z).Q$ and $R_2 = p\,x(z).(P \mid y(z).Q) + (1 - p)\,y(z).(x(z).P \mid Q)$. 
The transition groups starting from $R_1$ are: 

$R_1 \{\xrightarrow[1]{x(z)} P \mid y(z).Q\}$      $R_1 \{\xrightarrow[1]{y(z)} x(z).P \mid Q\}$ 

On the other hand, there is only one transition group starting from $R_2$, namely: 

$R_2 \{\xrightarrow[p]{x(z)} P \mid y(z).Q,\ \xrightarrow[1-p]{y(z)} x(z).P \mid Q\}$ 

Figure 1 illustrates the probabilistic automata corresponding to $R_1$ and $R_2$. 

Table 2. The late-instantiation probabilistic transition system of the $\pi_{pa}$-calculus. 

As announced in the introduction, the parallel operator is associative. This 
property can be easily shown by case analysis. 



Proposition 1. For all processes P, Q and R, the probabilistic automata of 
$P \mid (Q \mid R)$ and of $(P \mid Q) \mid R$ are isomorphic, in the sense that they differ only 
in the names of the states (i.e. the syntactic structure of the processes). 

We conclude this section with a discussion about the design choices of $\pi_{pa}$. 

Fig. 1. The probabilistic automata $R_1$ and $R_2$ of Example 1. The transition 
groups from $R_1$ are labeled by I and II respectively. The transition group from 
$R_2$ is labeled by I. 



3.1 The Rationale behind the Design of $\pi_{pa}$ 

In defining the rules of the operational semantics of $\pi_{pa}$ we felt there was only 
one natural choice, with the exception of the rules Com and Close. For them 
we could have given a different definition, with respect to which the parallel 
operator would still be associative. 

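The alternative definition we had considered for Com was: 

Com'   $\dfrac{P \{\xrightarrow[1]{\bar{x}y} P'\} \qquad Q \{\xrightarrow[p_i]{\mu_i} Q_i\}_i}{P \mid Q\ \{\xrightarrow[p'_i]{\tau} P' \mid Q_i\}_{i : \mu_i = x(y)}}$   $\exists i.\ \mu_i = x(y)$ and $p'_i = p_i / \sum_{j : \mu_j = x(y)} p_j$ 

and similarly for Close. 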

The difference between Com and Com' is that the latter forces the process 
performing the input action (Q) to perform only those actions that are compat- 
ible with the output action of the partner (P). 

At first Com' seemed to be a reasonable rule. At a deeper analysis, however, 
we discovered that Com' imposes certain restrictions on the schedulers that, 
in a distributed setting, would be rather unnatural. In fact, the natural way of 
implementing the $\pi_{pa}$ communication in a distributed setting is by representing 
the input and the output partners as processes sharing a common channel. When 
the sender wishes to communicate, it puts a message in the channel. When the 
receiver wishes to communicate, it tests the channel to see if there is a message, 
and, in the positive case, it retrieves it. In case the receiver has a choice guarded 
by input actions on different channels, the scheduler can influence this choice by 
activating certain senders instead of others. However, if more than one sender 
has been activated, i.e. more than one channel contains data at the moment 
in which the receiver is activated, then it will be the receiver which decides 
internally which channel to select. Com models exactly this situation. Note that 
the scheduler can influence the choices of the receiver by selecting certain outputs 
to be premises in Com, and delaying the others by using Par. 
With Com', on the other hand, when an input-guarded choice is executed, 
the choice of the channel is determined by the scheduler. Thus Com' models the 
assumption that the scheduler can only activate (at most) one sender before the 
next activation of a receiver. 

The following example illustrates the difference between Com and Com'. 

Example 2. Consider the processes $P_1 = \bar{x}_1 y$, $P_2 = \bar{x}_2 z$, $Q = 1/3\ x_1(y).Q_1 + 2/3\ x_2(y).Q_2$, 
and define $R = (\nu x_1)(\nu x_2)(P_1 \mid P_2 \mid Q)$. Under Com, the transition 
groups starting from R are 

$R \{\xrightarrow[1/3]{\tau} R_1,\ \xrightarrow[2/3]{\tau} R_2\}$      $R \{\xrightarrow[1]{\tau} R_1\}$      $R \{\xrightarrow[1]{\tau} R_2\}$ 

where $R_1 = (\nu x_1)(\nu x_2)(P_2 \mid Q_1)$ and $R_2 = (\nu x_1)(\nu x_2)(P_1 \mid Q_2)$. The first group 
corresponds to the possibility that both $x_1$ and $x_2$ are available for input when 
Q is scheduled for execution. The other groups correspond to the availability of 
only $x_1$ and only $x_2$ respectively. 

Under Com', on the other hand, the only possible transition groups are 

$R \{\xrightarrow[1]{\tau} R_1\}$      $R \{\xrightarrow[1]{\tau} R_2\}$ 

Note that, in both cases, the only possible transitions are those labeled with τ, 
because $x_1$ and $x_2$ are restricted at the top level. 
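
The probability 1 in the second and third groups under Com can be traced to the redistribution performed by Res. The sketch below assumes that "redistributed" means dividing each surviving probability by the total probability of the surviving branches, which is consistent with the numbers in Example 2. The names Branch, Restriction, restrict and example2 are hypothetical, and the target labels are plain strings.

```java
import java.util.ArrayList;
import java.util.List;

// One branch of a transition group: the channel it acts on (null for tau),
// a label for the target state, and its probability within the group.
final class Branch {
    final String channel, target;
    final double prob;
    Branch(String channel, String target, double prob) {
        this.channel = channel;
        this.target = target;
        this.prob = prob;
    }
}

final class Restriction {
    // Drops the branches acting on the restricted channel and rescales the rest
    // so that they sum to 1. An empty result means no transition group is left.
    static List<Branch> restrict(List<Branch> group, String restricted) {
        double total = 0.0;
        for (Branch b : group) {
            if (!restricted.equals(b.channel)) total += b.prob;
        }
        List<Branch> out = new ArrayList<>();
        for (Branch b : group) {
            if (!restricted.equals(b.channel)) {
                out.add(new Branch(b.channel, b.target, b.prob / total));
            }
        }
        return out;
    }

    // Group of Example 2 when only the sender on x1 takes part in Com:
    // the communication (tau) keeps probability 1/3, and the input on x2
    // remains a visible action with probability 2/3.
    static List<Branch> example2() {
        List<Branch> g = new ArrayList<>();
        g.add(new Branch(null, "R1", 1.0 / 3));
        g.add(new Branch("x2", "input on x2 pending", 2.0 / 3));
        return g;
    }
}
```

Calling `Restriction.restrict(Restriction.example2(), "x2")` leaves a single τ branch to R1 with probability (1/3)/(1/3) = 1, matching the group in which only $x_1$ is available.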

4 Solving the Electoral Problem in $\pi_{pa}$ 

In [2] it has been proved that, in certain networks, it is not possible to solve 
the leader election problem by using the asynchronous π-calculus. The problem 
consists in ensuring that all processes will reach an agreement (elect a leader) 
in finite time. One example of such a network is the system consisting of two 
symmetric nodes $P_0$ and $P_1$ connected by two internal channels $x_0$ and $x_1$ (see 
Figure 2). 

Fig. 2. A symmetric network $P = \nu x_0\,\nu x_1 (P_0 \mid P_1)$. The restriction on $x_0$, $x_1$ is 
made in order to enforce synchronization. 

In this section we will show that it is possible to solve the leader election 
problem for the above network by using the $\pi_{pa}$-calculus. Following [ 0 ], we will 
assume that the processes communicate their decision to the “external world” by 
using channels $o_0$ and $o_1$. 

The reason why this problem cannot be solved with the asynchronous 
π-calculus is that a network with a leader is not symmetric, and the asynchronous 
π-calculus is not able to force the initial symmetry to break. Suppose for example 
that $P_0$ would elect itself as the leader after performing a certain sequence of 
actions. By symmetry, and because of lack of synchronous communication, the 
same actions may be performed by $P_1$. Therefore $P_1$ would elect itself as leader, 
which means that no agreement has been reached. 

We propose a solution based on the idea of breaking the symmetry by repeat- 
ing again and again certain random choices, until this goal has been achieved. 
The difficult point is to ensure that it will be achieved with probability 1 under 
every possible scheduler. 

Our algorithm works as follows. Each process performs an output on its 
channel and, in parallel, tries to perform an input on both channels. If it succeeds, 
then it declares itself to be the leader. If neither process succeeds, it is 
because both of them perform exactly one input (thus reciprocally preventing 
the other from performing the second input). This might occur because the 
inputs can be performed only sequentially.¹ In this case, the processes have to 
try again. The algorithm is illustrated in Table 3. 



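$$
\begin{array}{l}
P_i \;=\; \bar{x}_i\langle t\rangle \\
\quad \mid\; \mathrm{rec}_X \big(\; 1/2\ \tau.x_i(b).\ \mathbf{if}\ b \\
\qquad\qquad\qquad \mathbf{then}\ \big(\, (1-\varepsilon)\ x_{i\oplus 1}(b).(\bar{o}_i\langle i\rangle \mid \bar{x}_i\langle f\rangle) \\
\qquad\qquad\qquad\qquad\ +\ \varepsilon\ \tau.(\bar{x}_i\langle t\rangle \mid X) \,\big) \\
\qquad\qquad\qquad \mathbf{else}\ \bar{o}_i\langle i\oplus 1\rangle \\
\qquad\qquad +\ 1/2\ \tau.x_{i\oplus 1}(b).\ \mathbf{if}\ b \\
\qquad\qquad\qquad \mathbf{then}\ \big(\, (1-\varepsilon)\ x_i(b).(\bar{o}_i\langle i\rangle \mid \bar{x}_{i\oplus 1}\langle f\rangle) \\
\qquad\qquad\qquad\qquad\ +\ \varepsilon\ \tau.(\bar{x}_{i\oplus 1}\langle t\rangle \mid X) \,\big) \\
\qquad\qquad\qquad \mathbf{else}\ \bar{o}_i\langle i\oplus 1\rangle \;\big)
\end{array}
$$

Table 3. A $\pi_{pa}$ solution for the electoral problem in the symmetric network of 
Figure 2. Here $i \in \{0, 1\}$ and $\oplus$ is the sum modulo 2. 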

¹ In the $\pi_{pa}$-calculus and in most process algebras there is no primitive for simultaneous 
input action. Nestmann has proposed in jzj the addition of such a construct as a way 
of enhancing the expressive power of the asynchronous π-calculus. Clearly, with this 
addition, the solution to the electoral problem would be immediate. 

In the algorithm, the selection of the first input is controlled by each process 
with a probabilistic blind choice, i.e. a choice whose branches are prefixed by 
a silent (τ) action. This means that the process commits to the choice of the 
channel before knowing whether it is available. It can be proved that this com- 
mitment is essential for ensuring that the leader will be elected with probability 
1 under every possible adversary scheduler. The distribution of the probabilities, 
on the contrary, is not essential. This distribution however affects the efficiency 
(i.e. how soon the synchronization protocol converges). It can be shown that it 
is better to split the probability as evenly as possible (hence 1/2 and 1/2). 

After the first input is performed, a process tries to perform the second input. 
What we would need at this point is a priority choice, i.e. a construct that 
selects the first branch if the prefix is enabled, and selects the second branch 
otherwise. With this construct the process would perform the input on the other 
channel when it is available, and backtrack to the initial situation otherwise. 
Since such a construct does not exist in the π-calculi, we use probabilities as a 
way of approximating it. Thus we do not guarantee that the first branch will 
be selected for sure when the prefix is enabled, but we guarantee that it will be 
selected with probability close to 1: the symbol ε represents a very small positive 
number. Of course, the smaller ε is, the more efficient the algorithm is. 

When a process, say $P_0$, succeeds in performing both inputs, then it declares 
itself to be the leader. It also notifies this decision to the other process. For the 
notification we could use a different channel, or we may use the same channel, 
provided that we have a way to communicate that the output on such chan- 
nel has now a different meaning. We follow this second approach, and we use 
boolean values t and f for messages. We stipulate that t means that the leader 
has not been decided yet, while f means that it has been decided. Notice that 
the symmetry is broken exactly when one process succeeds in performing both 
inputs. 

In the algorithm we make use of the if-then-else construct, which is defined 
by the structural rules 

if t then P else Q ≡ P        if f then P else Q ≡ Q 

As discussed in [8], these features (booleans and if-then-else) can be translated 
into the asynchronous π-calculus, and therefore into $\pi_{pa}$. 

The next theorem states that the algorithm is correct, namely that the probability 
that a leader is eventually elected is 1 under every scheduler. Due to space 
limitations we omit the proof; the interested reader can find it in [^ . 

Theorem 1. Consider the process $\nu x_0\,\nu x_1 (P_0 \mid P_1)$ and the algorithm of Table 3. 
The probability that the leader is eventually elected is 1 under every adversary. 

We conclude this section with the observation that, if we modify the blind 
choice to be a choice prefixed with the input actions which come immediately 
afterward, then the above theorem would not hold anymore. In fact, we can de- 
fine a scheduler which selects the processes in alternation, and which suspends 
a process, and activates the other, immediately after the first has made a ran- 
dom choice and performed an input. The latter will be forced (because of the 
guarded choice) to perform the input on the other channel. Then the scheduler 
will proceed with the first process, which at this point can only backtrack. Then 
it will schedule the second process again, which will also be forced to backtrack, 
and so on. Since all the choices of the processes are forced in this scheme, the 
scheduler will produce an infinite (unsuccessful) execution with probability 1. 

5 Implementation of $\pi_{pa}$ in a Java-like Language 

In this section we propose an implementation of the synchronization-closed 
$\pi_{pa}$-calculus, namely the subset of $\pi_{pa}$ consisting of processes in which all occurrences 
of communication actions $x(y)$ and $\bar{x}y$ are under the scope of a restriction 
operator $\nu x$. This means that all communication actions are forced to synchronize. 

The implementation is written in a Java-like language following the idea 
outlined in Section 3.1. It is compositional wrt all the operators, and distributed, 
i.e. homomorphic wrt the parallel operator. 

Channels are implemented as one-position buffers, namely as objects of the 
following class: 

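class Channel { 

    Channel message;    // the name currently held in the buffer 
    boolean isEmpty; 

    public Channel() {  // a new channel starts with an empty buffer 
        isEmpty = true; 
    } 

    // Output action: wait until the buffer is free, then deposit y. 
    public synchronized void send(Channel y) throws InterruptedException { 
        while (!isEmpty) wait(); 
        isEmpty = false; 
        message = y; 
        notifyAll(); 
    } 

    // Input action: never suspends; s.test tells whether a datum was taken. 
    public synchronized GuardState test_and_receive() { 
        GuardState s = new GuardState(); 
        if (!isEmpty) { 
            s.test = true; 
            s.value = message; 
            isEmpty = true; 
            notifyAll();    // wake up any sender suspended in send() 
            return s; 
        } else { 
            s.test = false; 
            s.value = null; 
            return s; 
        } 
    } 
} 

class GuardState { 
    public boolean test; 
    public Channel value; 
} 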
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The methods send and test_and_receive are used for implementing the 
output and the input actions respectively. They are both synchronized, because 
the test for the emptyness (resp. non-emptyness) of the channel, and the subse- 
quent placement (resp. removal) of a datum, must be done atomically. 

Note that, in principle, the receive method could have been defined dually to 
the send method, i.e. read and remove a datum if present, and suspend (wait) 
otherwise. This definition would work for input prefixes which are not in the 
context of a choice. However, it does not work for input guarded choice. In 
order to simulate correctly the behavior of the input guarded choice, in fact, we 
should check continuously for input events, until we find one which is enabled. 
Suspending when one of the input guards is not enabled would be incorrect. Our
definition of test_and_receive circumvents this problem by reporting a failure
to the caller, instead of suspending it.

Given the above representation of channels, the π_pa-calculus can be
implemented by using the following encoding [(·)]:



Probabilistic Choice

[( Σ_{i=1}^{m} p_i x_i(y).P_i + Σ_{i=m+1}^{n} p_i τ.P_i )] =

{ boolean choice = false;
  GuardState s = new GuardState();
  float x;
  Random gen = new Random();
  while (!choice) {
    x = 1 - gen.nextFloat();   // nextFloat() returns a real in [0,1)

    if (0 < x <= p_1)
      { s = x_1.test_and_receive();
        if (s.test) { y = s.value; [(P_1)]
                      choice = true; }
      }
    ...
    if (p_1 + p_2 + ... + p_{m-1} < x <= p_1 + p_2 + ... + p_m)
      { s = x_m.test_and_receive();
        if (s.test) { y = s.value; [(P_m)]
                      choice = true; }
      }
    if (p_1 + p_2 + ... + p_m < x <= p_1 + p_2 + ... + p_{m+1})
      { [(P_{m+1})]
        choice = true; }
    ...
    if (p_1 + p_2 + ... + p_{n-1} < x <= p_1 + p_2 + ... + p_n)
      { [(P_n)]
        choice = true; }
  }
}

Note that with this implementation, when no input guards are enabled, the
process keeps performing internal (silent) actions instead of suspending.
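
The polling behaviour of the encoding can be made concrete in plain Java. The following is a minimal sketch, not the authors' implementation: the class names Buffer, Guard and ProbChoice, the method choose, the use of Object for messages, and the convention that the first inputs.length branches are the input-guarded ones are all assumptions introduced here for illustration. The probabilities are assumed to sum to 1.

```java
import java.util.Random;

/** Result of a non-blocking receive: 'test' tells whether a datum was taken. */
final class Guard {
    final boolean test;
    final Object value;
    Guard(boolean test, Object value) { this.test = test; this.value = value; }
}

/** One-position buffer in the style of the Channel class of this section. */
final class Buffer {
    private Object message;
    private boolean isEmpty = true;

    /** Blocks while the buffer is full, then stores y. */
    public synchronized void send(Object y) {
        while (!isEmpty) {
            try {
                wait();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while sending", e);
            }
        }
        isEmpty = false;
        message = y;
        notifyAll();
    }

    /** Never blocks: reports failure instead of suspending the caller. */
    public synchronized Guard testAndReceive() {
        if (isEmpty) return new Guard(false, null);
        Object v = message;
        message = null;
        isEmpty = true;
        notifyAll();                       // wake senders waiting for a free slot
        return new Guard(true, v);
    }
}

final class ProbChoice {
    /** Index of the chosen branch and, for input branches, the datum received. */
    static final class Outcome {
        final int branch;
        final Object value;
        Outcome(int branch, Object value) { this.branch = branch; this.value = value; }
    }

    /**
     * Branches 0..inputs.length-1 are input-guarded (branch i reads inputs[i]);
     * the remaining branches, up to p.length-1, are silent (tau) branches.
     * p[i] is the probability of branch i.
     */
    static Outcome choose(Buffer[] inputs, double[] p, Random gen) {
        while (true) {
            double x = 1.0 - gen.nextDouble();   // nextDouble() is in [0,1), so x is in (0,1]
            double lower = 0.0;
            for (int i = 0; i < p.length; i++) {
                double upper = lower + p[i];
                if (lower < x && x <= upper) {
                    if (i >= inputs.length) return new Outcome(i, null);   // tau branch
                    Guard s = inputs[i].testAndReceive();
                    if (s.test) return new Outcome(i, s.value);            // enabled input
                    break;                                                  // disabled guard: draw again
                }
                lower = upper;
            }
        }
    }
}

final class ProbChoiceDemo {
    public static void main(String[] args) {
        Buffer a = new Buffer();
        Buffer b = new Buffer();
        b.send("hello");                   // only the second guard is enabled
        ProbChoice.Outcome o = ProbChoice.choose(
                new Buffer[] { a, b }, new double[] { 0.5, 0.5 }, new Random(42));
        System.out.println(o.branch + " " + o.value);
    }
}
```

In the demo only the guard on b is enabled, so every draw that selects the first branch fails its test and the loop draws again. This is the busy-waiting that the note above describes: a disabled guard costs a silent step rather than a suspension.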



Output Action

[(x̄y)] = { x.send(y); }



Restriction

[(νx P)] = { Channel x = new Channel(); [(P)] }

Parallel If our language is provided with a parallel operator, then we can just
have a homomorphic mapping:

[(P_1 | P_2)] = [(P_1)] | [(P_2)]

In Java, however, there is no parallel operator. In order to mimic it, a possibility
is to define a new class for each process we wish to compose in parallel, and then
create and start an object of that class:

class processP1 extends Thread {
    public void run() {
        [(P_1)]
    }
}

[(P_1 | P_2)] = { new processP1().start(); [(P_2)] }



Recursion Remember that the process rec_X P represents a process X defined
as X ≝ P, where P may contain occurrences of X. For each such process, define
the following class:

class X {
    static public void exec() {
        [(P)]
    }
}

Then define:

[(rec_X P)] = { X.exec(); }
[(X)] = { X.exec(); }

6 Conclusion and Future Work

We have defined a probabilistic extension π_pa of the asynchronous π-calculus
based on the model of probabilistic automata. The main novelty is the
introduction of a probabilistic choice operator. The parallel operator is still modeled
nondeterministically, the idea being that it is controlled by an external
scheduler. We have argued that our calculus is more powerful than the asynchronous
π-calculus by showing that it is able to express the solution to the electoral
problem in a symmetric network.

Future work includes the embedding of the π-calculus into π_pa and the
development of a proof system for properties of π_pa programs.

Abstract. A new treatment of data refinement in typed lambda calculus
is proposed, phrased in terms of pre-logical relations [HS99] rather than
logical relations, and incorporating a constructive element. Constructive
data refinement is shown to have desirable properties, and a substantial
example of refinement is presented.



1 Introduction 

One of the activities involved in developing programs from specifications is the 
transformation of “abstract programs” involving types of data that are not nor- 
mally available as primitive in programming languages (graphs, sets, etc.) into 
“concrete programs” in which a representation of these in terms of simpler types 
of data is provided. Apart from the change to data representation, such data 
refinement should have no effect on the results computed by the program: the 
concrete program should be equivalent to the abstract program in the sense that 
all computational observations should return the same results in both cases. 

The standard treatment of data refinement in the context of typed lambda
calculus, originating with Reynolds in [Rey81, Rey83] but described most clearly
in [Ten94], cf. Sect. 8.5 of [Mit96], uses logical relations to prove the correctness
of refinements. This work has its roots in [Hoa72], which proposes that the
correctness of the concrete program be verified using an invariant on the domain of
concrete values together with a function mapping concrete values (that satisfy
the invariant) to abstract values. In algebraic terms, what is required is a
homomorphism from a subalgebra of the concrete algebra to the abstract algebra. A
strictly more general method is to take a homomorphic relation (a so-called
correspondence [Sch90], cf. [Mil71]) in place of a homomorphism from a subalgebra.
Logical relations extend these ideas to deal with higher-order functions.

Proof method ([Ten94]). Let A and B be Σ-Henkin models and let OBS, the
observable types, be a subset of Types(Σ). To show that B is a refinement of
A, find a logical relation R over A and B such that R^σ is the identity relation
for each σ ∈ OBS. We then say that B is a logical refinement of A and write
A ⇝ B, or A ⇝_R B when we want to make R explicit.

J. Tiuryn (Ed.): FOSSACS 2000, LNCS 1784, pp. lOl- fTTUl 2000. 

(c) Springer-Verlag Berlin Heidelberg 2000

It is well-known that the composition of two logical relations is not in general
a logical relation. It follows that, given logical refinements A ⇝_R B and
B ⇝_S C, the composition S ∘ R cannot in general be used as a witness for the
composed refinement A ⇝ C. (In fact, the problem is more serious than it appears
at first: sometimes there is no witness for A ⇝ C at all, see Sect. 3.) This is at odds
with the stepwise nature of refinement, and the transitivity of the underlying
concept of refinement expressed in terms of observational equivalence. It is one
source of examples demonstrating the incompleteness of the above proof method;
there are other examples that do not involve composition of refinement steps,
see e.g. Sect. 5. The proof method is complete in the absence of higher-order
term constants [Mit96].

In [HS99], a weakening of the notion of logical relations called pre-logical
relations was studied; cf. [PPST00]. Pre-logical relations are closed under
composition; in fact, they are the minimal weakening of logical relations with this
property. They completely characterize observational equivalence of Henkin models,
without restriction to first-order signatures. Replacing logical relations with
pre-logical relations in the above gives a notion of pre-logical refinement which is in
pleasing harmony with stepwise refinement. Indeed, it is equivalent to the
underlying concept of refinement, i.e. sound and complete as a proof method.

This is an improvement but pre-logical refinement still does not entirely
accord with our intuition concerning data refinement and stepwise development
of programs. For one thing, like logical refinement it is a symmetric relation.
We will consider a more elaborate notion of data refinement, called constructive
pre-logical refinement (Sect. 4). This is a relation between specifications,
written SP ⇝^{OBS}_{δ} SP', which incorporates a construction in the form of a
derived signature morphism δ taking models of SP' to Henkin models over the
signature of SP. Derived signature morphisms define the types and constants in one
signature by giving terms over another signature, and this corresponds directly
to the code in an ML functor body. It follows that the result of a complete
chain of constructive refinements is a Henkin model, corresponding to a
modular ML program, which is a solution to the original programming task. We give
an extended example of constructive data refinement in the context of exact real
number computation, and show that it is not a (constructive) logical refinement
(Sect. 5).

Some recent accounts of data refinement in typed lambda calculus have
employed variants of logical relations that are related to pre-logical relations, for
instance [KOPTT97]. Our inclusion of a constructive element in the relation is
new, and our example appears to be the first non-trivial concrete example of
data refinement in the lambda calculus literature.

The idea of constructive pre-logical refinement comes from the world of
algebraic specifications, where it is called abstractor implementation [ST88] or
behavioural implementation [ST97]. This paper is an attempt to explain this
idea in lambda calculus terms, since it is a substantial improvement on current
accounts of data refinement in that context. One novelty with respect to
existing work on abstractor implementations concerns the connection with pre-logical
relations, which generalizes Schoett's characterization of observational
equivalence via correspondences and makes a bridge with work on data refinement in
lambda calculus based on logical relations. Another novelty concerns the use
of derived signature morphisms in the typed lambda calculus for defining
constructions. In order for abstractor implementations to compose, constructions
are required to preserve observational equivalence, a property known as stability
[Sch87]. This requirement is normally imposed as an assumption on the
language used for defining constructions, which is left unspecified. Here, stability
follows easily from the Basic Lemma of pre-logical relations. Finally, the example
in Sect. 5 goes considerably beyond the simple examples of refinement of data
representation that have been considered previously.

2 Preliminaries: Syntax and Semantics 

For the sake of simplicity of the exposition we restrict attention to λ→, the
simply-typed lambda calculus having → as its only type constructor.

Definition 2.1. The set of types over a set B of base types (or type constants)
is given by the grammar σ ::= b | σ → σ where b ranges over B. A signature Σ
consists of a set B of type constants and a collection C of typed term constants
c : σ. Types(Σ) denotes the set of types over B.

In a Σ-context Γ = x_1:σ_1, ..., x_n:σ_n, we require that x_i ≠ x_j for all
1 ≤ i < j ≤ n and σ_i ∈ Types(Σ) for all 1 ≤ i ≤ n. Σ-terms are given by the grammar
M ::= x | c | λx:σ.M | M M where x ranges over variables and c over term
constants. The usual typing rules associate each well-formed term M in context
Γ with a type σ ∈ Types(Σ), written Γ ▷ M : σ (or Γ ▷_Σ M : σ when we need
to make Σ explicit). If Γ is empty then we write simply M : σ or ▷_Σ M : σ.

Definition 2.2. A Σ-Henkin model A consists of:

- a carrier set ⟦σ⟧^A for each σ ∈ Types(Σ);

- a function App^{σ,τ}_A : ⟦σ → τ⟧^A → ⟦σ⟧^A → ⟦τ⟧^A for each σ,τ ∈ Types(Σ);

- an element ⟦c⟧^A ∈ ⟦σ⟧^A for each term constant c : σ in Σ; and

- elements K^{σ,τ}_A ∈ ⟦σ → (τ → σ)⟧^A and
  S^{ρ,σ,τ}_A ∈ ⟦(ρ → σ → τ) → (ρ → σ) → ρ → τ⟧^A for each ρ,σ,τ ∈ Types(Σ)

such that

- K^{σ,τ}_A x y = x and S^{ρ,σ,τ}_A x y z = (x z)(y z); and

- (extensionality) if App^{σ,τ}_A f x = App^{σ,τ}_A g x for every x ∈ ⟦σ⟧^A, then f = g.

The class of all Σ-Henkin models is denoted Mod(Σ).

Term constants of functional type are interpreted as total functions. Allowing
partial functions does not seem to introduce problems, but we have not checked
the details. Moreover, the use of Henkin models in this paper is not essential;
the definitions and results in [HS99] that we will need later also apply to
non-extensional models, for instance combinatory algebras.

A Γ-environment η on a Henkin model A assigns elements of A to variables,
with η(x) ∈ ⟦σ⟧^A for x : σ in Γ. A Σ-term Γ ▷ M : σ is interpreted in A under a
Γ-environment η in the usual way with λ-abstraction interpreted via translation
to combinators, written ⟦Γ ▷ M : σ⟧^A_η, and this is an element of ⟦σ⟧^A. If M is
closed then we write simply ⟦M : σ⟧^A.

We can allow terms to contain the fixed-point combinator Y (viewed as
a term constant). To interpret such terms in a Henkin model A, we need to
additionally require elements Y^σ_A ∈ ⟦(σ → σ) → σ⟧^A for each σ ∈ Types(Σ) such
that f (Y^σ_A f) = Y^σ_A f. We will assume that this additional structure is present
whenever we consider such terms.

Definition 2.3. A logical relation R over Σ-Henkin models A and B is a family
of relations {R^σ ⊆ ⟦σ⟧^A × ⟦σ⟧^B}_{σ ∈ Types(Σ)} such that:

- R^{σ→τ}(f, g) iff ∀a ∈ ⟦σ⟧^A. ∀b ∈ ⟦σ⟧^B. R^σ(a, b) ⇒ R^τ(App_A f a, App_B g b).

- R^σ(⟦c⟧^A, ⟦c⟧^B) for every term constant c : σ in Σ.



Definition 2.4. If Γ ▷ M : σ and Γ ▷ M' : σ then ∀Γ. M =_σ M' is a
Σ-equation. The subscript σ is omitted when it is obvious. A Σ-Henkin model A
satisfies a Σ-equation ∀Γ. M =_σ M' if ⟦Γ ▷ M : σ⟧^A_η = ⟦Γ ▷ M' : σ⟧^A_η for all
Γ-environments η. It is easy to add connectives and quantifiers, giving sentences
of predicate logic with equality. A specification SP consists of a signature Σ and
a set Φ of Σ-sentences. Then Sig(SP) = Σ, and Mod(SP) (the models of SP)
is the class of all Σ-Henkin models satisfying all the sentences in Φ.

3 Data Refinement 

We begin with an analysis of the failure of composition of logical relations and 
its impact on composition of logical refinements. 

Example 3.1. Let Σ contain two type constants, b and b', and no term constants.
Consider Σ-Henkin models A, B, C which interpret b and b' as follows, and
interpret function types using full set-theoretic function spaces:
⟦b⟧^A = {*} = ⟦b'⟧^A; ⟦b⟧^B = {*} and ⟦b'⟧^B = {∘,•}; ⟦b⟧^C = {∘,•} = ⟦b'⟧^C.
Let R be the logical relation over A and B induced by R^b = {(*,*)} and
R^{b'} = {(*,∘), (*,•)} and let S be the logical relation over B and C induced by
S^b = {(*,∘), (*,•)} and S^{b'} = {(∘,∘), (•,•)}. S ∘ R is not a logical relation
because it does not relate the identity function in ⟦b⟧^A → ⟦b'⟧^A to the identity
function in ⟦b⟧^C → ⟦b'⟧^C.

The problem is that the only two functions in ⟦b⟧^B → ⟦b'⟧^B are {* ↦ ∘} and
{* ↦ •}, and S does not relate these to the identity in C. □
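
The claim in this example can be checked by unfolding the definition of a logical relation at type b → b'. The display below is a sketch of that calculation; the names id_A (the unique function from {*} to {*}), k_c and const_c are shorthand introduced here, not notation from the paper.

```latex
\begin{align*}
(\mathcal{S}\circ\mathcal{R})^{b} &= (\mathcal{S}\circ\mathcal{R})^{b'}
   = \{(*,\circ),\,(*,\bullet)\}\\
\mathcal{R}^{b\to b'} &= \{(\mathrm{id}_{\mathcal{A}},\,k_\circ),\ (\mathrm{id}_{\mathcal{A}},\,k_\bullet)\}
   && k_c = \{* \mapsto c\}\\
\mathcal{S}^{b\to b'} &= \{(k_\circ,\,\mathrm{const}_\circ),\ (k_\bullet,\,\mathrm{const}_\bullet)\}
   && \mathrm{const}_c = \{\circ \mapsto c,\ \bullet \mapsto c\}\\
(\mathcal{S}\circ\mathcal{R})^{b\to b'} &=
   \{(\mathrm{id}_{\mathcal{A}},\,\mathrm{const}_\circ),\ (\mathrm{id}_{\mathcal{A}},\,\mathrm{const}_\bullet)\}
\end{align*}
```

A logical relation whose base components are those in the first line would have to relate id_A to every function g from {∘,•} to {∘,•}, since (*, g(c)) lies in the component at b' for both choices of c. That includes the identity on {∘,•}, which is missing from the last line, so the composite is not logical.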

This simple example shows that we may have logical refinements A ⇝_R B and
B ⇝_S C (where we take OBS = ∅ in both cases), where S ∘ R is not a logical
relation and so cannot act as witness to A ⇝ C.

One possible solution might be to construct the relations at higher types from
the composite relations at base types. This works if Σ contains only first-order
term constants (guaranteeing that the restriction to base types lifts to a logical
relation) and if OBS contains only base types (guaranteeing that the resulting
logical relation is the identity relation for each σ ∈ OBS). The following example
shows how this idea fails in the presence of second-order term constants.

Example 3.2. In the previous example, add a type constant bool and a term
constant c : (b → b') → bool to Σ. Let ⟦bool⟧^A = ⟦bool⟧^B = ⟦bool⟧^C = {true, false}
and take R^{bool} and S^{bool} to be the identity. In each model, let the interpretation
of c take constant functions to true and all other functions to false. The resulting
R and S are logical relations. As before, S ∘ R is not a logical relation but now
the restriction of S ∘ R to base types cannot be lifted to a logical relation either:
this would relate the identity function in ⟦b⟧^A → ⟦b'⟧^A (which is a constant
function) to every function in ⟦b⟧^C → ⟦b'⟧^C, but then the constant function in
A would be related to non-constant functions in C, and so ⟦c⟧^A could not be
related to ⟦c⟧^C, otherwise true would be related to false. □

These two examples show that certain ways of composing the logical relations
witnessing A ⇝ B and B ⇝ C do not yield a logical relation witnessing A ⇝ C.
Such a witness may exist, however, and in the above example it does. For OBS =
∅, the full relation suffices; for OBS = {bool}, the full relation on b together with
the empty relation on b' and the identity on bool lifts to a logical relation. But
if we add constants b1, b2 : b and b1', b2' : b' with ⟦b1⟧^A = ⟦b2⟧^A = ⟦b1'⟧^A =
⟦b2'⟧^A = *, ⟦b1⟧^C = ⟦b1'⟧^C = ∘ and ⟦b2⟧^C = ⟦b2'⟧^C = • then there is no logical
relation over A and C which is the identity on bool, so A ⇝ C does not hold for
OBS = {bool}. The following proposition summarizes the situation.

Proposition 3.3. A ⇝ B and B ⇝ C does not in general imply A ⇝ C. □

Ultimately, the justification for the definition of logical refinement lies in the 
notion of observational equivalence, in terms of which the underlying concept of 
data refinement is formulated. 

Definition 3.4. Let A and B be Σ-Henkin models and let OBS ⊆ Types(Σ).
Then A is observationally equivalent to B with respect to OBS, written A ≡_OBS B,
if for any two closed Σ-terms M, N : σ for σ ∈ OBS, ⟦M : σ⟧^A = ⟦N : σ⟧^A
iff ⟦M : σ⟧^B = ⟦N : σ⟧^B.

It is usual to take OBS to be the “built-in” types for which equality is
decidable, for instance bool and/or nat. Then A and B are observationally equivalent
iff it is not possible to distinguish between them by performing computational
experiments. Note that OBS ⊆ OBS' implies ≡_OBS ⊇ ≡_OBS'.

For OBS = {nat}, the connection between logical refinement and observa- 
tional equivalence is given by Mitchell’s representation independence theorem. 



Theorem 3.5 ([Mit96]). Let Σ be a signature that includes a type constant
nat, and let A and B be Σ-Henkin models, with ⟦nat⟧^A = ⟦nat⟧^B = ℕ. If there
is a logical relation R over A and B with R^{nat} the identity relation on natural
numbers, then A ≡_{nat} B. Conversely, if A ≡_{nat} B, Σ provides a closed term
for each element of ℕ, and Σ contains only first-order term constants, then there
is a logical relation R over A and B with R^{nat} the identity relation. □

The restriction to signatures with first-order term constants in the second part
of the theorem is necessary, and this is the key to the incompleteness of logical
refinements as a proof method and the problem with composability of logical
refinements. If A ⇝ B ⇝ C then A ≡_OBS B ≡_OBS C, and so A ≡_OBS C
since ≡_OBS is an equivalence relation. But then it follows that A ⇝ C only for
signatures without higher-order term constants.

An improved version of the above theorem, without the restriction to first- 
order signatures, holds if logical relations are replaced by pre-logical relations. 



Definition 3.6 ([HS99]). A pre-logical relation R over Σ-Henkin models A
and B is a family of relations {R^σ ⊆ ⟦σ⟧^A × ⟦σ⟧^B}_{σ ∈ Types(Σ)} such that:

- If R^{σ→τ}(f, g) then ∀a ∈ ⟦σ⟧^A. ∀b ∈ ⟦σ⟧^B. R^σ(a, b) ⇒ R^τ(App_A f a, App_B g b).

- R^σ(⟦c⟧^A, ⟦c⟧^B) for every term constant c : σ in Σ.

- R(S^{ρ,σ,τ}_A, S^{ρ,σ,τ}_B) and R(K^{σ,τ}_A, K^{σ,τ}_B) for all ρ,σ,τ ∈ Types(Σ).

Theorem 3.7 ([HS99]). Let A and B be Σ-Henkin models and let OBS ⊆
Types(Σ). Then A ≡_OBS B iff there exists a pre-logical relation over A and B
which is a partial injection on OBS. □

This suggests the following. (We switch to a notation that makes the set of 
observable types explicit.) 

Definition 3.8. Let A and B be Σ-Henkin models and OBS ⊆ Types(Σ). Then
B is a pre-logical refinement of A, written A ⇝^{OBS} B, if there is a pre-logical
relation R over A and B such that R^σ is a partial injection for each σ ∈ OBS.

We phrase this as a definition, rather than as a proof method for the underlying 
notion of data refinement, in contrast to logical refinements. As a proof method 
it is sound and complete, and therefore equivalent to this underlying notion. 

Pre-logical relations compose [HS99], so pre-logical refinements compose, and
this explains why stepwise refinement is sound. Another explanation goes via
Theorem 3.7: A ⇝^{OBS} B ⇝^{OBS} C ⇒ A ≡_OBS B ≡_OBS C ⇒ A ≡_OBS C ⇒
A ⇝^{OBS} C. The set of observable types need not be the same in both steps, as
the following result spells out.

Proposition 3.9. If A ⇝^{OBS} B and B ⇝^{OBS'} C then A ⇝^{OBS} C provided
OBS ⊆ OBS'. □

The definition of observational equivalence may be extended to allow
experiments to include the fixed-point combinator by requiring Henkin models to
include elements Y^σ_A ∈ ⟦(σ → σ) → σ⟧^A for each σ ∈ Types(Σ) as indicated
above. Theorem 3.7 still holds provided pre-logical relations are required to
relate Y^σ_A with Y^σ_B for all σ.

4 Constructive Data Refinement



Pre-logical refinement, like logical refinement, is a symmetric relation. This does 
not fit the intuition that refinement is about going from an abstract, high-level 
description of a program to a concrete, more detailed description. There are 
at least two basic defects of the notion of pre-logical refinement of which the 
symmetry of the relation is merely a symptom. 

First, it is a relation between Henkin models. The intuition behind stepwise 
refinement suggests that it should be rather a relation between descriptions of 
Henkin models, i.e. between specifications. The original specification of a prob- 
lem rarely determines a single permissible behaviour: some of the details of 
the behaviour are normally left open to the implementor. So at this stage one 
starts with an assortment of models, corresponding to all the possible choices of 
behaviours. (Some of these will be isomorphic to one another, given a suitable 
notion of isomorphism, but if the specification permits more than one externally- 
visible behaviour then there will be non-isomorphic models.) The final program, 
on the other hand, corresponds to a single Henkin model. So the refinement 
process involves not just replacement of abstract data representations by more 
concrete ones, but also selection between permitted behaviours. 

Definition 4.1. Let SP and SP' be specifications with Σ = Sig(SP) = Sig(SP')
and OBS ⊆ Types(Σ). Then SP' is a pre-logical refinement of SP, written
SP ⇝^{OBS} SP', if for any B ∈ Mod(SP') there is some A ∈ Mod(SP) with a
pre-logical relation R over A and B such that R^σ is a partial injection for each
σ ∈ OBS.

Second, the idea that refinement is a reduction of one as-yet-unsolved problem 
to another is not explicit. Intuitively, each refinement step reduces the current 
problem to a smaller problem, such that any solution to the smaller problem 
gives rise to a solution to the original problem. In pre-logical refinement of spec- 
ifications, one models this by having the successive specifications accumulate 
more and more details arising from successive design decisions. Some parts be- 
come fully determined, and remain unchanged as a part of the specification until 
the final program is obtained. The parts that are not yet fully determined corre- 
spond to the unsolved parts of the original problem. (To avoid clutter, we omit 
the OBS decorations in the following diagrams.) 




It is much cleaner to separate the finished parts from the specification, proceeding
with the development of the unresolved parts only, giving

where DONE is a specification having a ready realisation (e.g. a specification of
the built-ins of the programming language in use). The finished parts κ_1, ..., κ_n
are constructions extending any solution (model) of a reduced problem
(specification) to a solution of the previous problem, and so we will refer to this relation
as constructive data refinement. The signatures of successive specifications may
be different, in contrast to the earlier refinement relations.

Constructive data refinement will be defined below. As constructions we will
take “δ-reducts” of Σ'-Henkin models induced by “derived signature morphisms”
δ : Σ → Σ', where Σ and Σ' are the signatures before and after refinement,
respectively. This amounts to giving an interpretation of the type constants and
term constants in Σ as types and terms over Σ'.

Definition 4.2. Let Σ and Σ' be signatures. A derived signature morphism
δ : Σ → Σ' consists of:

- a mapping from base types in Σ to types over Σ': for every base type b in
  Σ, δ(b) ∈ Types(Σ'). This induces a mapping (also called δ) from Types(Σ)
  to Types(Σ'), using δ(σ → τ) = δ(σ) → δ(τ).

- a type-preserving mapping from term constants in Σ to closed terms over
  Σ': for every c : σ in Σ, ▷_Σ' δ(c) : δ(σ).

This induces a mapping (also called δ) from terms over Σ to terms over Σ',
using δ(x) = x, δ(λx:σ.M) = λx:δ(σ).δ(M), δ(M M') = δ(M) δ(M'), and (if we
are using the Y combinator) δ(Y^σ) = Y^{δ(σ)}. Composition is obvious.

Proposition 4.3. If δ : Σ → Σ' and Γ ▷_Σ M : σ then δ(Γ) ▷_Σ' δ(M) : δ(σ)
where δ(x_1:σ_1, ..., x_n:σ_n) = x_1:δ(σ_1), ..., x_n:δ(σ_n). □
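
The way δ extends from base types to all types can be sketched as a recursive function on type syntax. The following Java fragment is only an illustration under stated assumptions: the class names Ty, Base and Arrow are introduced here, and the particular mapping real ↦ nat → nat, bool ↦ nat is a hypothetical choice, not the morphism used later in the paper.

```java
import java.util.HashMap;
import java.util.Map;

/** Simple types over a set of base types: sigma ::= b | sigma -> sigma. */
abstract class Ty {
    /** Homomorphic extension of a base-type mapping: delta(s -> t) = delta(s) -> delta(t). */
    abstract Ty translate(Map<String, Ty> delta);
}

final class Base extends Ty {
    final String name;
    Base(String name) { this.name = name; }

    Ty translate(Map<String, Ty> delta) {
        Ty image = delta.get(name);
        if (image == null) {
            throw new IllegalArgumentException("no image for base type " + name);
        }
        return image;   // delta(b) is given directly by the morphism
    }

    @Override public String toString() { return name; }
}

final class Arrow extends Ty {
    final Ty dom;
    final Ty cod;
    Arrow(Ty dom, Ty cod) { this.dom = dom; this.cod = cod; }

    Ty translate(Map<String, Ty> delta) {
        // translate domain and codomain independently
        return new Arrow(dom.translate(delta), cod.translate(delta));
    }

    @Override public String toString() { return "(" + dom + " -> " + cod + ")"; }
}

final class DerivedMorphismDemo {
    public static void main(String[] args) {
        Ty nat = new Base("nat");
        // Hypothetical delta on base types: real as functions nat -> nat, bool as nat.
        Map<String, Ty> delta = new HashMap<>();
        delta.put("real", new Arrow(nat, nat));
        delta.put("bool", nat);

        // Type of a comparison constant: real -> real -> bool.
        Ty lt = new Arrow(new Base("real"), new Arrow(new Base("real"), new Base("bool")));
        System.out.println(lt.translate(delta));
        // prints ((nat -> nat) -> ((nat -> nat) -> nat))
    }
}
```

The same recursion scheme applies to terms: variables are left alone, term constants are replaced by their images, and abstraction and application are translated componentwise.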

A derived signature morphism corresponds exactly to a functor in ML
terminology, or a parameterised program [Gog84]: the functor parameter is a Σ'-Henkin
model, and the functor body contains code which defines the components of Σ
using the components of Σ'. If the fixed-point combinator is available then this
code may involve recursive functions. (Recursively-defined types are not allowed
since we are working in λ→, but see Sect. El)

The semantics of these programs as functions on Henkin models is given by
the notion of δ-reduct.

Definition 4.4. Let δ : Σ → Σ' and let A' be a Σ'-Henkin model. The δ-reduct
of A' is the Σ-Henkin model A'|_δ defined as follows:

- ⟦σ⟧^{A'|_δ} = ⟦δ(σ)⟧^{A'} for each σ ∈ Types(Σ);

- App^{σ,τ}_{A'|_δ} = App^{δ(σ),δ(τ)}_{A'} for each σ,τ ∈ Types(Σ);

- ⟦c⟧^{A'|_δ} = ⟦δ(c)⟧^{A'} for each term constant c : σ in Σ; and

- K^{σ,τ}_{A'|_δ} = K^{δ(σ),δ(τ)}_{A'}, S^{ρ,σ,τ}_{A'|_δ} = S^{δ(ρ),δ(σ),δ(τ)}_{A'} and
  (if we are using the Y combinator) Y^{σ}_{A'|_δ} = Y^{δ(σ)}_{A'} for each ρ,σ,τ ∈ Types(Σ).

Proposition 4.5. ⟦Γ ▷_Σ M : σ⟧^{A'|_δ}_η = ⟦δ(Γ) ▷_Σ' δ(M) : δ(σ)⟧^{A'}_η. □

Σ-Henkin models and pre-logical relations between such models form a
category Mod(Σ). The following property is intimately related to the concept of
stability in [Sch87].

Proposition 4.6 (Stability). For any δ : Σ → Σ', the mapping ·|_δ extends to
a functor ·|_δ : Mod(Σ') → Mod(Σ). If a pre-logical relation R' in Mod(Σ')
is a partial injection on OBS' ⊆ Types(Σ'), then R'|_δ is a partial injection on
δ^{-1}(OBS'). Thus A' ≡_OBS' B' implies A'|_δ ≡_OBS B'|_δ for any OBS ⊆ Types(Σ)
such that δ(OBS) ⊆ OBS'.

Proof. Take (R'|_δ)^σ = R'^{δ(σ)}. It follows from the Basic Lemma for pre-logical
relations (see [HS99]) that this yields a pre-logical relation. □

Now we are ready to give a formal definition of constructive data refinement. 

Definition 4.7. Let SP and SP' be specifications, δ : Sig(SP) → Sig(SP') be
a derived signature morphism, and let OBS ⊆ Types(Sig(SP)). Then SP' is a
constructive pre-logical refinement of SP via δ, written SP ⇝^{OBS}_{δ} SP', if for any
B ∈ Mod(SP') there is some A ∈ Mod(SP) with a pre-logical relation R over A
and B|_δ such that R^σ is a partial injection for each σ ∈ OBS.

It is easy to modify this definition to give a notion of constructive logical
refinement, written ⇝_δ. The correspondence between derived signature morphisms
as defined above and ML functors justifies the use of the word “constructive”.
In Sect. 5 below we give an example of constructive pre-logical refinement and
show that it is not a constructive logical refinement.

Constructive pre-logical refinements compose via the composition of their 
underlying derived signature morphisms: 

Proposition 4.8. If SP ⇝^{OBS}_{δ} SP' and SP' ⇝^{OBS'}_{δ'} SP'' then
SP ⇝^{OBS}_{δ'∘δ} SP'' provided δ(OBS) ⊆ OBS'. □
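
The proposition is stated without proof. The argument can be sketched from the stability property (Proposition 4.6) and the closure of pre-logical relations under composition. The sketch below assumes, beyond what is stated in the text, that reducts compose, i.e. that taking the δ'-reduct and then the δ-reduct agrees with taking the reduct along the composite; the names of the models and relations are chosen here for illustration.

```latex
Let $\mathcal{C} \in \mathit{Mod}(SP'')$.
Since $SP''$ refines $SP'$ via $\delta'$, there are $\mathcal{B} \in \mathit{Mod}(SP')$
and a pre-logical relation $\mathcal{S}$ over $\mathcal{B}$ and $\mathcal{C}|_{\delta'}$
that is a partial injection on $\mathit{OBS}'$.

Since $SP'$ refines $SP$ via $\delta$, there are $\mathcal{A} \in \mathit{Mod}(SP)$
and a pre-logical relation $\mathcal{R}$ over $\mathcal{A}$ and $\mathcal{B}|_{\delta}$
that is a partial injection on $\mathit{OBS}$.

By stability, $\mathcal{S}|_{\delta}$ is a pre-logical relation over $\mathcal{B}|_{\delta}$
and $(\mathcal{C}|_{\delta'})|_{\delta} = \mathcal{C}|_{\delta' \circ \delta}$, and it is a
partial injection on $\delta^{-1}(\mathit{OBS}') \supseteq \mathit{OBS}$,
where the inclusion is exactly the side condition $\delta(\mathit{OBS}) \subseteq \mathit{OBS}'$.

Hence $\mathcal{S}|_{\delta} \circ \mathcal{R}$ is a pre-logical relation over $\mathcal{A}$
and $\mathcal{C}|_{\delta' \circ \delta}$, and it is a partial injection on $\mathit{OBS}$
because partial injections are closed under composition.
```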

The required relationship between OBS and OBS' is just what one would expect: 
as refinement progresses, the successive specifications become increasingly less 
abstract and so the number of non-observable types tends to decrease, while the 
overall task of implementing SP with observable types OBS remains the same. 

As suggested above, a chain of constructive refinements is complete when
the original problem has been reduced to a specification DONE with a given
(implemented) model D:

    SP_0 ⇝^{OBS_1}_{δ_1} SP_1 ⇝^{OBS_2}_{δ_2} ··· ⇝^{OBS_n}_{δ_n} SP_n = DONE



170 



Furio Honsell et al. 



Then, by Prop. 4.8, if the condition on OBS_1, ..., OBS_n is satisfied, DONE is
a constructive pre-logical refinement of SP_0 via δ_n ∘ ··· ∘ δ_2 ∘ δ_1 with respect
to OBS_1: the Henkin model D|_{δ_n ∘ ··· ∘ δ_2 ∘ δ_1} is observationally equivalent to some
model of SP_0 with respect to OBS_1. In other words, δ_n ∘ ··· ∘ δ_2 ∘ δ_1 is a program
that is a solution to the original programming task.

5 An Example from Real Number Computation 

We now present an extended example of constructive data refinement in the 
context of exact real number computation. The point of this example is that the 
desired refinement can be expressed in terms of pre-logical relations, but not in 
terms of logical relations. 

We will describe a specification SP involving real numbers and some
operations on them, and a specification SP' which provides a means of implementing
SP using higher-type functions. We will then present a constructive pre-logical
refinement SP ⇝^{OBS}_{δ} SP' that captures this implementation; however, we will
show that there is no constructive logical refinement SP ⇝_δ SP'.

5.1 A Specification for Real Number Operations 

The specification SP has an underlying signature Σ consisting of the type
constants real and bool and the following term constants:

    0, 1 : real                       sup_[0,1] : (real → real) → real
    − : real → real                   true, false, ⊥ : bool
    +, *, max : real → real → real    < : real → real → bool

We declare bool (only) to be an observable type. As usual, we treat +, * and <
as infixes. One could of course consider richer signatures (e.g. with division), but
the signature above has the technical advantage that all the above operations are
total functions in the intended models (see below regarding the interpretation
of sup_[0,1]).

A class of intended models for SP may be given via some logical axioms, as
follows. For 0, 1, −, +, *, we take the usual axioms for a field; we also add axioms
saying that the type real is totally ordered by ≤, where t ≤ u abbreviates the
logical formula ∃z:real. u = t + (z * z). For max and sup_[0,1] we add the axioms

    ∀x, y:real. (x ≤ y ⇒ max x y = y) ∧ (y ≤ x ⇒ max x y = x)

    ∀f : real → real. (∃z:real. ∀x:real. 0 ≤ x ∧ x ≤ 1 ⇒ f(x) ≤ z) ⇒
        (∀z:real. sup_[0,1] f ≤ z ⇔ ∀x:real. 0 ≤ x ∧ x ≤ 1 ⇒ f(x) ≤ z)

An important logical consequence of these axioms (which we shall use later) is
the formula sup_[0,1](λx:real.0) = 0.
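
The stated consequence follows in a few steps from the supremum axiom. The derivation below is a sketch, assuming that the order ≤ is reflexive and antisymmetric (as part of being a total order), that 0 ≤ 1 (which holds since 1 = 0 + 1 * 1), and that (λx:real.0) x = 0 by β-reduction; writing f for λx:real.0 is shorthand introduced here.

```latex
\begin{align*}
&\text{Let } f = \lambda x{:}\mathit{real}.\,0. \text{ Then } f(x) = 0 \le 0 \text{ for all } x,
  \text{ so } f \text{ is bounded (take } z = 0\text{), and the axiom gives}\\
&\qquad \forall z.\ \bigl(\mathit{sup}_{[0,1]} f \le z \iff
  \forall x.\ (0 \le x \wedge x \le 1) \Rightarrow 0 \le z\bigr).\\
&\text{Since } x = 0 \text{ satisfies } 0 \le x \wedge x \le 1,
  \text{ the right-hand side is equivalent to } 0 \le z, \text{ hence}\\
&\qquad \forall z.\ \bigl(\mathit{sup}_{[0,1]} f \le z \iff 0 \le z\bigr).\\
&\text{Taking } z = 0: \quad 0 \le 0 \text{ gives } \mathit{sup}_{[0,1]} f \le 0.\\
&\text{Taking } z = \mathit{sup}_{[0,1]} f: \quad
  \mathit{sup}_{[0,1]} f \le \mathit{sup}_{[0,1]} f \text{ gives } 0 \le \mathit{sup}_{[0,1]} f.\\
&\text{By antisymmetry, } \mathit{sup}_{[0,1]} f = 0.
\end{align*}
```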

The language we have defined is surprisingly expressive. For instance, every
algebraic real number is definable by a closed term, and so any model for SP
must contain a copy of at least the algebraic reals. In fact, the models we have in
mind contain all the computable or recursive reals (though not every computable
real is definable by a closed term).

The only purpose of including the type bool in SP is to allow us to make
observations on real numbers. In general we do not expect to be able to tell
when two real numbers are the same, but we can tell when they are different.
It suffices for our purposes to include the order relation < in our signature. The
axioms for < are:

    ∀x, y:real. ¬(y ≤ x) ⇒ (x < y) = true
    ∀x, y:real. ¬(x ≤ y) ⇒ (x < y) = false
    ∀x, y:real. x = y ⇒ (x < y) = ⊥

This completes the definition of SP. 

Some brief remarks on models for SP may be helpful. The full set-theoretic
type structure over ℝ gives a model of SP, though we need to assign arbitrary
values to the interpretation of sup_[0,1] on functions f : ℝ → ℝ which are
unbounded on [0, 1]. There are also natural models in which the interpretation of
real → real is constrained to include only continuous functions (see e.g. [Nor98]).

5.2 A Specification for PCF Computations 

We now present a specification SP' corresponding to the familiar functional
language PCF [Plo77]. A constructive refinement from SP to SP' for OBS =
{bool} then amounts to a way of implementing SP in PCF via a “program” $\delta$.
The signature for SP' will consist of the single type constant nat and:

$0 : \mathit{nat}$          $\mathit{ifzero} : \mathit{nat} \to \mathit{nat} \to \mathit{nat} \to \mathit{nat}$

$\mathit{succ}, \mathit{pred} : \mathit{nat} \to \mathit{nat}$          $Y_\sigma : (\sigma \to \sigma) \to \sigma$   $(\sigma \in \mathit{Types}(\Sigma'))$

This is exactly the language for (a version of) PCF. The intention is that nat
stands for the lifted natural numbers, with the term $\bot = Y_{\mathit{nat}}(\lambda z{:}\mathit{nat}.\ z)$
denoting the bottom element. We freely employ syntactic sugar in PCF terms where
the meaning is evident.

We now wish to add axioms to ensure that any model for SP' is a model of 
PCF in some reasonable sense. We do not know whether all the axioms below 
are strictly necessary for our purposes, but they correspond to a well-understood 
class of models of PCF. Let us write $t{\downarrow}$ as an abbreviation for the formula
ifzero t 0 0 = 0 (we may read this as “t terminates”). First we have an axiom
saying there is only one non-terminating element of type nat:

$$\forall x{:}\mathit{nat}.\ \neg(x{\downarrow}) \Rightarrow x = \bot$$

For 0 and succ, we take the usual first-order Peano axioms for the terminating 
elements. For the remaining constants, we take the axioms 

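$$\mathit{pred}\ 0 = 0 \qquad \forall x{:}\mathit{nat}.\ \mathit{pred}(\mathit{succ}\ x) = x$$

$$\forall y, z{:}\mathit{nat}.\ \mathit{ifzero}\ 0\ y\ z = y \qquad \forall x, y, z{:}\mathit{nat}.\ \mathit{ifzero}(\mathit{succ}\ x)\ y\ z = z$$

$$\forall y, z{:}\mathit{nat}.\ \mathit{ifzero}\ \bot\ y\ z = \bot$$

$$\forall f{:}\sigma \to \sigma.\ Y_\sigma f = f(Y_\sigma f) \qquad \forall f{:}\sigma \to \sigma,\ z{:}\sigma.\ z = f z \Rightarrow Y_\sigma f \sqsubseteq_\sigma z$$

where $t \sqsubseteq_\sigma u$ abbreviates $\forall P{:}\sigma \to \mathit{nat}.\ (P\, t{\downarrow}) \Rightarrow (P\, u{\downarrow})$.
Note that the full set-theoretic type structure over $\mathbb{N}_\bot$ is not a model, because
not every set-theoretic function $\mathbb{N}_\bot \to \mathbb{N}_\bot$ has a fixed point. However, the usual
Scott model based on CPOs (see [Plo77]) and the game models of e.g. [AJM96]
do provide models of SP', as do their recursive analogues. The extensional closed
term model of PCF also provides a model (which in fact is isomorphic to the
recursive game models).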
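
The equations for pred and ifzero can be tried out on a toy interpretation of the lifted natural numbers. The following is only a minimal sketch in Python, not part of the specification: `BOT` and the function names are illustrative assumptions, the fixed point operator Y is not modelled, and the checks are restricted to the equations that are evaluated on terminating arguments.

```python
BOT = None  # hypothetical stand-in for the non-terminating element of type nat


def succ(x):
    # succ is strict: it propagates non-termination
    return BOT if x is BOT else x + 1


def pred(x):
    # pred 0 = 0, pred (succ x) = x
    if x is BOT:
        return BOT
    return 0 if x == 0 else x - 1


def ifzero(x, y, z):
    # strict in the tested argument only; y and z may themselves be BOT
    if x is BOT:
        return BOT
    return y if x == 0 else z


def terminates(t):
    # the abbreviation "t terminates": ifzero t 0 0 = 0
    return ifzero(t, 0, 0) == 0
```

In this sketch `terminates(BOT)` is false and `terminates(n)` is true for every natural number `n`, which matches the reading of the abbreviation given above.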

5.3 A Constructive Pre-logical Refinement 

We now describe a constructive refinement from SP to SP'. The basic idea is
that we will represent a real number r by an infinite sequence $d = d_0 d_1 d_2 \dots$ of
natural numbers, which in turn is represented by the function $f : \mathbb{N}_\bot \to \mathbb{N}_\bot$ given
by $f(i) = d_i$. (More generally: in any model B of SP', including non-standard
ones, there will be an inclusion from $\mathbb{N}_\bot$ to $[\![\mathit{nat}]\!]^B$; for simplicity of notation
we take $\mathbb{N}_\bot \subseteq [\![\mathit{nat}]\!]^B$. Then we represent d by any function $f \in [\![\mathit{nat} \to \mathit{nat}]\!]^B$
such that $f(i) = d_i$ for all $i \in \mathbb{N}$.) Operations on reals are then represented
by higher-type operations on such functions. There are many ways to choose a
suitable representation, and the differences between them do not matter much.
For definiteness, we will work with sequences d such that $d_i \le 2$ for all $i \ge 2$; such
a sequence will represent the real number $d_0 - d_1 + \sum_{i \ge 2} 2^{-i}(d_i - 1)$. We will use
the meta-notation IsReal(f) to mean that $f \in [\![\mathit{nat} \to \mathit{nat}]\!]^B$ represents a real
number in this way, and write Val(f) to denote the real number it represents.
Note that there will be many sequences representing any given real number;
this is in fact an essential feature of any representation of reals for which
even the most basic arithmetical operations are computable. The above choice
is essentially a signed binary representation involving infinite sequences of digits
−1, 0, 1 (coded in PCF by 0, 1, 2 respectively).
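
To make the redundancy of this representation concrete, here is a small illustrative sketch in Python that evaluates finite prefixes of a digit sequence. The names `val_prefix`, `zero_a` and `zero_b` are hypothetical, and the weight $2^{-i}$ for the digit at index i is an assumption about the formula above; the point of the example (two different sequences converging to the same real) does not depend on the exact weights.

```python
from fractions import Fraction


def val_prefix(d, n):
    """Partial sum d(0) - d(1) + sum_{i=2}^{n} 2^{-i} * (d(i) - 1)."""
    total = Fraction(d(0) - d(1))
    for i in range(2, n + 1):
        # digits from index 2 on are 0, 1, 2, coding -1, 0, 1
        assert 0 <= d(i) <= 2
        total += Fraction(d(i) - 1, 2 ** i)
    return total


# Two different digit sequences that both represent the real number 0.
def zero_a(i):
    return 1  # integer part 1 - 1 = 0, every signed digit is 0


def zero_b(i):
    if i < 2:
        return 1  # integer part 1 - 1 = 0
    return 2 if i == 2 else 0  # +1/4, then -1/8 - 1/16 - ...
```

The prefixes of `zero_a` are exactly 0, while the prefix of `zero_b` up to index n equals $2^{-n}$, so both sequences have the same limit although they differ at every index from 2 on.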

We can make precise the idea of implementing SP in terms of SP' by means
of a derived signature morphism $\delta : \Sigma \to \Sigma'$. For the basic types, we take

$$\delta(\mathit{real}) = \mathit{nat} \to \mathit{nat}, \qquad \delta(\mathit{bool}) = \mathit{nat}.$$

Next, for each term constant $c : \sigma$ of $\Sigma$ we need to give a term $\delta(c) : \delta(\sigma)$ in
$\Sigma'$. For the constants 0 and 1, this can be done just by choosing one particular
representing sequence for these real numbers, e.g.

$$\delta(0) = \lambda i{:}\mathit{nat}.\ 1, \qquad \delta(1) = \lambda i{:}\mathit{nat}.\ \mathit{ifzero}\ i\ 2\ 1.$$

For the booleans, we take $\delta(\mathit{true}) = 0$, $\delta(\mathit{false}) = 1$ and $\delta(\bot) = \bot$. It is also
straightforward to write PCF programs Minus, Plus, Times, Max, Less for $\delta(-)$,
$\delta(+)$, $\delta(*)$, $\delta(\max)$ and $\delta(<)$ respectively. For example, we may take

$$\mathit{Minus} = \lambda f{:}\mathit{nat} \to \mathit{nat},\ i{:}\mathit{nat}.\ \text{if } i = 0 \text{ then } f(1) \text{ else if } i = 1 \text{ then } f(0) \text{ else } 2 \mathbin{\dot{-}} f(i)$$

where $\dot{-}$ implements truncated subtraction. In any model B, this satisfies the
following condition (which should be understood as a meta-level assertion):

$$\forall f \in [\![\mathit{nat} \to \mathit{nat}]\!]^B.\ \mathrm{IsReal}(f) \Rightarrow \mathrm{IsReal}([\![\mathit{Minus}]\!]^B f) \wedge \mathrm{Val}([\![\mathit{Minus}]\!]^B f) = -\mathrm{Val}(f).$$

Coding details for the other operations are given e.g. in [Phi98]. What is more
surprising is that the operation $\sup_{[0,1]}$ can be represented in PCF by a
third-order function Sup, by means of a clever use of higher-type recursion (a detailed
account of the algorithm with code is given in [Sim98]).
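
The behaviour of Minus can be checked on finite prefixes. The sketch below is a Python transcription under stated assumptions, not PCF: `val_prefix`, `minus`, `one` and `x` are hypothetical names, digit sequences are ordinary Python functions on non-negative integers, and the digit weights $2^{-i}$ are assumed as in the representation described earlier. The negation property holds for any choice of weights, since Minus swaps the two leading entries and mirrors every later digit around 1.

```python
from fractions import Fraction


def val_prefix(f, n):
    """Partial sum f(0) - f(1) + sum_{i=2}^{n} 2^{-i} * (f(i) - 1)."""
    total = Fraction(f(0) - f(1))
    for i in range(2, n + 1):
        total += Fraction(f(i) - 1, 2 ** i)
    return total


def minus(f):
    """Transcription of Minus: swap f(0) and f(1), map digit d to 2 - d."""
    def g(i):
        if i == 0:
            return f(1)
        if i == 1:
            return f(0)
        return max(0, 2 - f(i))  # truncated subtraction
    return g


def one(i):
    # transcription of delta(1) = \i. ifzero i 2 1
    return 2 if i == 0 else 1


def x(i):
    # an arbitrary example: 3 - 1 + 1/4 - 1/8 + 0 + 1/32 = 69/32
    digits = [3, 1, 2, 0, 1, 2]
    return digits[i] if i < len(digits) else 1
```

For every prefix length n the partial value of `minus(f)` is exactly the negation of the partial value of `f`, which is the finite counterpart of the meta-level assertion Val(Minus f) = −Val(f).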

Proposition 5.1. SP' is a constructive pre-logical refinement of SP via $\delta$.

Proof sketch. Starting from any $B \in \mathit{Mod}(SP')$, we will obtain a model $A \in
\mathit{Mod}(SP)$ and a pre-logical relation $\mathcal{R}$.

The correct definition of A is slightly subtle: the whole point is that the
obvious definition via a logical relation on B does not work (see below). First, we
embed B in its chain completion $\bar{B}$ via an inclusion $\iota$ (we omit the definition).
The main purpose of this step is to throw into the model all monotone functions
of type nat → nat; this ensures that in $\bar{B}$ we can represent all the classical
reals and not just the computable ones (cf. Section 5.4 below). One can check that
if B is a model of SP' then so is $\bar{B}$. Next we define partial equivalence relations
$E^\sigma$ on $[\![\delta(\sigma)]\!]^{\bar{B}}$ for each $\sigma \in \mathit{Types}(\Sigma)$. For the base cases, we take

$$E^{\mathit{real}}(f, g) \text{ iff } \mathrm{IsReal}(f) \wedge \mathrm{IsReal}(g) \wedge \mathrm{Val}(f) = \mathrm{Val}(g),$$

$$E^{\mathit{bool}}(x, y) \text{ iff } x = y \wedge x \in \{[\![0]\!]^{\bar{B}}, [\![1]\!]^{\bar{B}}, [\![\bot]\!]^{\bar{B}}\}.$$

(The latter clause means that E behaves as a partial injection for observable
types.) We lift this to higher types as a binary logical relation on $\bar{B}$. One can
show that for each constant c of $\Sigma$ we have $E(\iota[\![\delta(c)]\!]^B, \iota[\![\delta(c)]\!]^B)$.

We now construct the required model A by taking $[\![\sigma]\!]^A$ to be the set of
equivalence classes of $E^\sigma$; one can check that A yields a Henkin model for SP. Finally,
we define relations $R^\sigma$ from $[\![\sigma]\!]^A$ to $[\![\delta(\sigma)]\!]^B$ by $R^\sigma(a, b)$ iff $\iota(b) \in a$; clearly this
defines a pre-logical relation $\mathcal{R}$ as required. □

5.4 Lack of a Constructive Logical Refinement 

We now explain why SP' is not a constructive logical refinement of SP via $\delta$.
Intuitively, the idea is that a logical relation $\mathcal{R}$ is completely determined once we
have fixed the relation at basic types: we have no freedom of choice for higher
types. For certain models B of SP', this means that we are forced to include in
the relation $R^{\mathit{real} \to \mathit{real}}$ highly pathological elements of $[\![\delta(\mathit{real} \to \mathit{real})]\!]^B$,
and our PCF implementation of $\sup_{[0,1]}$ will fail to work for these pathological
elements. This leads to a contradiction since we require $R([\![\sup_{[0,1]}]\!]^A, [\![\mathit{Sup}]\!]^B)$
for some model A of SP.

More precisely, let us take B to be some effective model of SP', such as the
effective Scott domain model [PE77] or the term model for PCF. All that we
really require is that the elements of $[\![\mathit{nat} \to \mathit{nat}]\!]^B$ correspond to just the partial
recursive functions $\mathbb{N}_\bot \to \mathbb{N}_\bot$. We will show the following:

Theorem 5.2. There is no model $A \in \mathit{Mod}(SP)$ admitting a logical relation $\mathcal{R}$
over A and $B|_\delta$ which is a partial injection on bool.

Proof sketch. The proof of the theorem hinges on the existence of a pathological
PCF implementation of the constant zero function: that is, a term Funny :
(nat → nat) → (nat → nat) such that $R([\![\lambda x{:}\mathit{real}.\ 0]\!]^A, [\![\mathit{Funny}]\!]^B)$, but
such that $[\![\mathit{Sup}\ \mathit{Funny}]\!]^B = [\![\mathit{Bad}]\!]^B$, where Bad = $\lambda y{:}\mathit{nat}.$ if y < 2 then 1 else $\bot$.
Since SP entails that $\sup_{[0,1]}(\lambda x{:}\mathit{real}.\ 0) = 0$, we have $R([\![0]\!]^A, [\![\mathit{Bad}]\!]^B)$, which
can be shown to be impossible.

The idea behind Funny is based on the Kleene tree, a well-known counterexample
from recursion theory (see e.g. [Bee85]). Intuitively, Funny(f) gives 0
whenever f represents a recursive real number but diverges for certain
non-recursive reals. □

Notice how the pre-logical relation $\mathcal{R}$ in the proof of Proposition 5.1 avoids
this problem: the interpretation of Funny in $\bar{B}$ is not included in the partial
equivalence relation $E^{\mathit{real} \to \mathit{real}}$, since the model contains representations of
non-recursive reals on which Funny diverges.

The above example is robust in the sense that it is not just a feature of the
particular implementation Sup we have chosen: it can be shown that there is
no PCF program Sup that computes suprema for all relevant functions including
Funny. Indeed, we believe that the above theorem should hold for all possible
representations of the reals and all choices of the terms $\delta(c)$: the only condition
on $\delta$ we require is that $\delta(\mathit{real}) = \mathit{nat} \to \mathit{nat}$.



6 Conclusion 

The main purpose of this paper was to introduce the notion of constructive 
pre-logical refinement and explain how it relates to the usual account of data 
refinement for typed lambda calculus in terms of logical relations. In a nutshell, 
the relationship is that for data refinement logical relations work only because 
they are a special case of pre-logical relations, where the additional requirement 
imposed by logical relations is more of a hindrance than a help. 

There are many directions in which this approach could be developed. 

In Sect. 4 we considered linear chains of refinement steps. Decomposition of
implementation tasks into separate subtasks can be modelled using constructions
that take n-tuples of Henkin models as arguments, giving tree-shaped
refinement diagrams. In particular, consider $\delta : \Sigma \to (\Sigma'_1 + \dots + \Sigma'_n)$, where
$\Sigma'_1 + \dots + \Sigma'_n$ is the coproduct of the signatures $\Sigma'_1, \dots, \Sigma'_n$. This induces the reduct
$\cdot|_\delta : \mathit{Mod}(\Sigma'_1 + \dots + \Sigma'_n) \to \mathit{Mod}(\Sigma)$. However, this does not give an n-ary
construction, since $\mathit{Mod}(\Sigma'_1 + \dots + \Sigma'_n)$ and $\mathit{Mod}(\Sigma'_1) \times \dots \times \mathit{Mod}(\Sigma'_n)$ do not coincide
even up to isomorphism; in other words, higher-order models do not amalgamate
unambiguously. However, they weakly amalgamate: there is a standard (injective)
construction that maps $\mathit{Mod}(\Sigma'_1) \times \dots \times \mathit{Mod}(\Sigma'_n)$ into $\mathit{Mod}(\Sigma'_1 + \dots + \Sigma'_n)$
(e.g. by taking full function spaces for extra “mixed” function types). Composing
this with $\cdot|_\delta$, we obtain a function from $\mathit{Mod}(\Sigma'_1) \times \dots \times \mathit{Mod}(\Sigma'_n)$ to $\mathit{Mod}(\Sigma)$
as required. This still ignores one important aspect of development, namely the
possibility of mutual dependencies between subtasks. One solution, discussed
thoroughly in [SST92], is to use specifications of parametric models in the
development process; the same ideas should apply here, but the technical implications
of using higher-order models are yet to be worked out.

This paper presents a global view of specifications and their refinement: con- 
structions are required to work on the “whole system” (represented as a model 
of the implementing specification) and produce a whole system (represented as 
a model of the implemented specification). Good practice suggests that there 
should be a way to make the refinement steps “local” — that is, to use only part 
of the system built so far to implement some remaining parts of the requirements 
specification, and then add the result to the whole system built so far. Details 
will be provided in a longer version of the paper. 

In this paper we have focused on only. But of course, less elementary 
type structures are also of great importance in software development using data 
refinement. One can consider inductive/coinductive datatypes, or more generally
recursive types as in ML, or impredicative types as in Girard/Reynolds's
System F. For instance exact real numbers as in Sect. 5 are often implemented
as streams for efficiency reasons, also in purely functional contexts, and abstract
data types can be understood in the context of existential types. Notions of logical
relations, appropriate for each of these type disciplines, have been proposed
in the literature: see e.g. [Alt98] for inductive/coinductive types and [MM85] for
System F. In order to accommodate data refinement involving such datatypes we
need to introduce corresponding notions of pre-logical relation. As pointed out
in [HS99], there is a standard methodology here: simply require the interpretations
of the “relevant” constants in the two structures to be related. Despite
its simplicity, this methodology is extremely rewarding, and it allows one to harvest
serendipitous results also in related areas. A case in point is offered by PER
models of System F, where the extra latitude and flexibility given by defining
the exponential PER pre-logically allows for a number of possibly novel natural
model constructions. Finally, a notion of pre-logical relation for System F would
raise the intriguing question of the relationship between this framework and the
one in [Han99], where data refinements in the style of [ST88] are translated into
System F using existential types.

Abstract. We relate several models of concurrency introduced in the
literature in order to extend classical Mazurkiewicz traces. These are 
mainly Droste’s concurrent automata and Arnold’s CCI sets of P-traces, 
studied in the framework of local trace languages. Also, a connection 
between these models and classical traces is presented in detail through
a natural notion of projection. These relationships enable us to use
Arnold’s result efficiently in two other frameworks. First, we give a finite
distributed implementation for regular CCI sets of P-traces (or, equiva- 
lently, finite stably concurrent automata) by means of bounded labelled 
Petri nets. Second, we present a new, simple and constructive method to 
relate Stark’s trace automata with Bednarczyk’s asynchronous transition 
systems. This improves a recent result in Scott domain theory. 



Introduction. Mazurkiewicz trace languages are a well-known and widely stud- 
ied model of concurrency [1]. They were introduced in m to provide a partial 
order semantics for elementary Petri nets. In the past decade several differ- 
ent generalizations of classical traces have been studied in the literature. First, 
Droste introduced concurrent automata [5] for which the independence between 
actions is no longer a global independence relation, but depends on the current 
state of the system. These automata were shown to extend Bednarczyk’s
asynchronous transition systems [2] and Stark’s trace automata [18]. Independently,
Arnold introduced an extension of classical traces by means of labelled partial
orders called P-traces [1]. In particular, a strong connection between recognizable
classical trace languages and regular CCI sets of P-traces was established.
More recently, local trace languages were introduced to give a trace semantics
for Place/Transition nets [8,14]. There a local independence relation specifies in
each configuration which subsets of actions can be executed concurrently. 

At some point, it seems necessary to classify and relate the different models
of concurrency that have arisen in the literature. For instance, the synthesis problem of
Petri nets consists in characterizing which automata (or languages) correspond 
to the behavior of a Petri net [znas]. More generally, semantical studies bring 
relationships between models of different levels of abstraction [■iOl'iUblllj . 

In this paper, we relate three models of concurrency which are roughly at 
the same level of abstraction. These are CCI sets of P-traces, stably concurrent 
automata and a restricted subclass of local trace languages called stable trace 
languages. The latter are also precisely compared to classical trace languages by
means of projections. We show that these relationships lead to some improvements
for the theories of Petri nets, concurrent automata and dl-domains.

J. Tiuryn (Ed.): FOSSACS 2000, LNCS 1784, 2000.

After some basic definitions relating recognizable local trace languages and 
Mukund’s step transition systems unj, we introduce the subclass of stable trace 
languages with the help of some cube properties. The latter are actually meant 
to mimic the particular behaviors of stably concurrent automata. In that way, 
recognizable stable trace languages are easily shown to correspond to the behav- 
ior of finite stably concurrent automata. Next we focus on CCI sets of P-traces 
which are shown to be equivalent to some stable trace languages. Therefore they 
represent the behavior of stably concurrent automata. Also regular CCI sets of 
P-traces are associated to recognizable stable trace languages. Thus we obtain 
precise relationships between these three models. 

These connections lead us to give a new formulation of a strong result due
to Arnold [1, Th. 6.16] showing that these extensions of classical traces are
closely related to the original model: any recognizable stable trace language is 
the projection of a recognizable classical trace language. This relationship holds 
also for non-recognizable languages over infinite alphabets. However, answering 
an open problem raised by Arnold, we prove that this relationship fails in the 
case of non-recognizable stable trace languages over finite alphabets. This relies 
on a counter-example provided by a Producer-Consumer system. 

In a seminal paper [21], Zielonka proved that any recognizable classical trace
language is described by an asynchronous automaton which provides a finite
implementation in the form of distributed processes. In [1], Arnold introduced
an extension of Zielonka’s asynchronous automata, called P-asynchronous au- 
tomata. However these systems failed to describe all regular CCI sets of P-traces. 
Besides, it is still an open problem to know which regular CCI sets of P-traces are 
described by P-asynchronous automata (obviously these are not the whole class 
of regular CCI sets of P-traces, see m for a counter-example) . In order to avoid 
this restriction, we present a construction of a finite distributed implementation 
for any recognizable stable trace language (or any regular CCI set of P-traces) in 
the form of a labelled Petri net. This construction turns out to complete nicely a 
somewhat dual approach followed by Droste and Shortt [S] . There the Petri nets 
whose behavior corresponds to a stably concurrent automaton (or a stable trace 
language) are characterized by some simple conditions on the weight function. 

In [17], Schmitt tackles the difficult problem of defining a recognizability notion
for coherent dl-domains. The basic idea is that a coherent dl-domain should be
considered recognizable if it corresponds to the behavior of a finite distributed
automaton. However several families of distributed automata might be considered
and might give rise to different recognizability notions. The main result of [17]
asserts that the coherent dl-domains obtained from either finite trace automata
[18] or finite asynchronous transition systems [2] are the same. We present here a
new, simple and constructive proof of this result, whereas Schmitt’s approach
is not constructive.

The proofs of our main results partly rely on technical results borrowed from
[1] and P]. A detailed study is available in [TU].



On Recognizable Stable Trace Languages 179 



1 Basic Notions 

Preliminaries. We will use the following notations: for any (possibly infinite)
alphabet $\Sigma$, and any words $u \in \Sigma^*$, $v \in \Sigma^*$, we write $u \le v$ if u is a prefix of v,
i.e. there is $z \in \Sigma^*$ such that $u.z = v$; the empty word is denoted by $\varepsilon$. We write
$|u|_a$ for the number of occurrences of $a \in \Sigma$ in $u \in \Sigma^*$, and $\wp_f(\Sigma)$ denotes the
set of finite subsets of $\Sigma$; for any $p \in \wp_f(\Sigma)$, $\mathrm{Lin}(p) = \{u \in p^* \mid \forall a \in p, |u|_a = 1\}$
is the set of linearisations of p. Finally, if $\lambda : \Sigma \to \Sigma'$ is a map from $\Sigma$ to $\Sigma'$,
we also write $\lambda : \Sigma^* \to \Sigma'^*$ and $\lambda : \wp_f(\Sigma) \to \wp_f(\Sigma')$ to denote the naturally
associated monoid morphisms. For short, a right semi-congruence will be called a
right-congruence.
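
These notations translate directly into a few lines of code. The following is a minimal sketch in Python under the assumption that actions are single characters and words are strings; the function names `is_prefix`, `occurrences` and `lin` are illustrative and do not come from the paper.

```python
from itertools import permutations


def is_prefix(u, v):
    """u <= v: there is a word z with u.z = v."""
    return v.startswith(u)


def occurrences(u, a):
    """|u|_a: number of occurrences of the action a in the word u."""
    return u.count(a)


def lin(p):
    """Lin(p): the words over p in which every action of p occurs exactly once."""
    return {"".join(v) for v in permutations(sorted(p))}
```

Note that `lin(set())` contains only the empty word, in line with the definition applied to the empty step.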

Local Independence Relations and Local Trace Languages. As estab- 
lished in [HITT] , the behaviors of Petri nets are faithfully represented by local 
trace languages. These are a generalization of the classical Mazurkiewicz’ traces 
m since they specify sets of independent actions rather than pairs. 

Definition 1.1. A local independence relation over $\Sigma$ is a non-empty subset I
of $\Sigma^* \times \wp_f(\Sigma)$. The (local) trace equivalence $\sim$ induced by I is the least
equivalence on $\Sigma^*$ such that

TE1: $\forall u, u' \in \Sigma^*, \forall a \in \Sigma,\ u \sim u' \Rightarrow u.a \sim u'.a$;

TE2: $\forall (u,p) \in I, \forall p' \subseteq p, \forall v_1, v_2 \in \mathrm{Lin}(p'),\ u.v_1 \sim u.v_2$.

A (local) trace is a $\sim$-equivalence class [u] of a word $u \in \Sigma^*$.

By TE1 local trace equivalences are right-congruences. TE2 asserts that for every
subset of actions which are independent after a sequence u, all sequences obtained
by executing first u and then in an arbitrary order the actions from this subset,
are equivalent. Note also that local trace equivalences are Parikh equivalences:
$u \sim u' \Rightarrow \forall a \in \Sigma,\ |u|_a = |u'|_a$.
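
For a finite local independence relation, the trace equivalence restricted to words of bounded length can be computed with a union-find structure. The sketch below is an illustration in Python, not the paper's construction: the function name `trace_classes`, the encoding of actions as single characters and the length bound `max_len` are assumptions. It relies on the observation that the least equivalence satisfying TE1 and TE2 is generated by the pairs $(u.v_1.w, u.v_2.w)$, and that such pairs only relate words of equal length.

```python
from itertools import combinations, permutations, product


def trace_classes(alphabet, indep, max_len):
    """Map every word of length <= max_len to the least word of its trace.

    indep is a finite set of pairs (u, p): a word u over the alphabet
    and a frozenset p of actions that are independent after u.
    """
    words = ["".join(w) for k in range(max_len + 1)
             for w in product(sorted(alphabet), repeat=k)]
    parent = {w: w for w in words}

    def find(w):
        while parent[w] != w:
            parent[w] = parent[parent[w]]  # path halving
            w = parent[w]
        return w

    def union(x, y):
        rx, ry = find(x), find(y)
        if rx != ry:
            parent[max(rx, ry)] = min(rx, ry)  # keep the least word as root

    for u, p in indep:
        for k in range(2, len(p) + 1):
            for sub in combinations(sorted(p), k):
                # TE2: all linearisations of a subset of p are equivalent after u
                lins = [u + "".join(v) for v in permutations(sub)]
                # TE1: the equivalence is preserved by appending any suffix w
                for w in words:
                    if len(lins[0]) + len(w) <= max_len:
                        for x in lins[1:]:
                            union(lins[0] + w, x + w)
    return {w: find(w) for w in words}
```

With the single pair $(\varepsilon, \{a, b\})$ over the alphabet {a, b, c}, the words ab and ba fall in the same trace, and so do abc and bac, whereas cab and cba remain distinct because a and b are only declared independent after the empty word.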

These assumptions on the trace equivalence can be translated into explicit 
additional conditions on the local independence relation without affecting the 
resulting traces. A local independence relation satisfying these additional condi- 
tions is called complete and can be shown to be a maximal representative among 
local independence relations defining the same behaviors. 

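Definition 1.2. A local independence relation I over $\Sigma$ is complete if

Cpl1: $(u,p) \in I \wedge p' \subseteq p \Rightarrow (u,p') \in I$;

Cpl2: $(u,p) \in I \wedge p' \subseteq p \wedge v \in \mathrm{Lin}(p') \Rightarrow (u.v, p \setminus p') \in I$;

Cpl3: $(u,\{a,b\}) \in I \wedge (u.ab.v, p) \in I \Rightarrow (u.ba.v, p) \in I$;

Cpl4: $(u.a, \emptyset) \in I \Rightarrow (u, \{a\}) \in I$.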

Cpl1 makes explicit what TE2 from Def. 1.1 guarantees for the trace equivalence:
if a set of actions p can be executed concurrently after u, then so can any subset
of p; moreover, following Cpl2, the step p can be split into a sequential execution
v and a concurrent step of the remaining actions. We remark now that Cpl3 is
equivalent to the requirement that $u \sim u' \wedge (u,p) \in I \Rightarrow (u',p) \in I$. Thus Cpl3
states that after two equivalent sequences the independency of actions is the
same; it corresponds to the right-congruence property TE1 from Def. 1.1. Local
independence relations satisfying Cpl3 were called consistent in [16] and durable
in [^. Finally Cpl4 guarantees that whenever u.a is a sequential execution, then
action a is allowed as a step after u.

In this paper, we study the local trace languages introduced in El as combi- 
nations of a complete local independence relation and a language of sequences. 

Definition 1.3. A local trace language over $\Sigma$ is a structure $\mathcal{L} = (\Sigma, I, L)$
where I is a complete local independence relation on $\Sigma$ and $L \subseteq \Sigma^*$ is such that
$u \in L \Leftrightarrow (u, \emptyset) \in I$.

Note here that the set of sequences L is closed for the prefix relation and the 
trace equivalence. Moreover any local trace language is entirely determined by 
its associated local independence relation. 

Global Independence Relations and Mazurkiewicz Traces. Local trace
languages are actually a direct generalization of classical traces E0. There, the
independence between actions does not depend on the context of previously
occurred events. Thus we consider a global independence relation over $\Sigma$ to be
a binary symmetric and irreflexive relation $\parallel\ \subseteq \Sigma \times \Sigma$. Then a classical trace
language over $(\Sigma, \parallel)$ consists of a language $L \subseteq \Sigma^*$ which is closed for the
commutation of independent actions: $\forall u, v \in \Sigma^*, \forall a, b \in \Sigma,\ u.ab.v \in L \wedge a \parallel b \Rightarrow
u.ba.v \in L$.

In order to connect this approach with local trace languages, we will only
consider here prefix-closed languages. In that way any classical trace language can
be formally identified with a local trace language $\mathcal{L} = (\Sigma, I, L)$ for which $(u,p) \in
I$ if the actions in p are pairwise independent w.r.t. the global independence
relation. This leads us to introduce formally Mazurkiewicz trace languages within
the general framework of local trace languages as follows.

Definition 1.4. Let $\parallel$ be a global independence relation over $\Sigma$. A Mazurkiewicz
trace language over $(\Sigma, \parallel)$ is a local trace language $\mathcal{L} = (\Sigma, I, L)$ such that
$\forall u \in \Sigma^*, \forall n \in \mathbb{N}, \forall a_1, \dots, a_n \in \Sigma$:

$$(u, \{a_1, \dots, a_n\}) \in I \Leftrightarrow u.a_1 \dots a_n \in L \wedge \forall i, j \in [1,n] \text{ distinct},\ a_i \parallel a_j.$$

Now associating any prefix-closed classical trace language L over a fixed independence
alphabet $(\Sigma, \parallel)$ to the Mazurkiewicz trace language $\mathcal{L} = (\Sigma, I, L)$, where
I is defined as in Def. 1.4, we clearly build a one-to-one correspondence between
prefix-closed classical trace languages and Mazurkiewicz trace languages.

Despite this nice formal connection, we should stress here that the local
independence relation associated to a Mazurkiewicz trace language may have some
unusual (but technically necessary) properties. In particular, if the language L
is not forward-closed w.r.t. the global independence relation $\parallel$ then there are a
word u and two actions a and b such that $u.a \in L$, $u.b \in L$, $a \parallel b$ but $u.ab \notin L$; in
that case, a and b are not independent after u: $(u, \{a, b\}) \notin I$.
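
The remark above can be observed on a very small language. The following Python sketch is only an illustration under stated assumptions: actions are single characters, a finite language is a set of strings, the global independence relation is a set of two-element frozensets, and the function names `is_classical_trace_language` and `locally_independent` are hypothetical. The second function transcribes the characterization of I in Def. 1.4 and is meaningful when its first argument is a prefix-closed classical trace language.

```python
from itertools import combinations


def is_classical_trace_language(L, indep):
    """Prefix-closed and closed under commutation of adjacent independent actions."""
    for w in L:
        if any(w[:k] not in L for k in range(len(w))):
            return False
        for k in range(len(w) - 1):
            a, b = w[k], w[k + 1]
            if frozenset((a, b)) in indep and w[:k] + b + a + w[k + 2:] not in L:
                return False
    return True


def locally_independent(L, indep, u, p):
    """(u, p) in I iff u.a1...an in L and the actions of p are pairwise independent."""
    acts = sorted(p)
    in_language = (u + "".join(acts)) in L
    pairwise = all(frozenset(pair) in indep for pair in combinations(acts, 2))
    return in_language and pairwise


indep = {frozenset("ab")}
not_forward_closed = {"", "a", "b"}        # a and b both enabled, but ab is missing
forward_closed = {"", "a", "b", "ab", "ba"}
```

Both example languages are classical trace languages over the same independence alphabet, yet the step {a, b} is allowed after the empty word only in the second one.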

Recognizable Languages and Finite Step Transition Systems. The model
of step transition systems was introduced by Mukund [IS] in order to extend the
so-called synthesis problem of elementary Petri nets [7] to the more general
model of Place/Transition nets.

Definition 1.5. A step transition system over the alphabet $\Sigma$ is a structure
$A = (Q, s, \Sigma, \to)$ where Q is a set of states, $s \in Q$ is an initial state and
$\to\ \subseteq Q \times \wp_f(\Sigma) \times Q$ is a set of labelled transitions such that

- $\forall q_1, q_2 \in Q$: $q_1 \xrightarrow{\emptyset} q_2 \Rightarrow q_1 = q_2$;

- $\forall q_1, q_2 \in Q, \forall p' \subseteq p \in \wp_f(\Sigma)$: $q_1 \xrightarrow{p} q_2 \Rightarrow \exists q_3 \in Q,\ q_1 \xrightarrow{p'} q_3 \xrightarrow{p \setminus p'} q_2$;

- $\forall q_1, q_2, q_3 \in Q, \forall p \in \wp_f(\Sigma)$: $q_1 \xrightarrow{p} q_2 \wedge q_1 \xrightarrow{p} q_3 \Rightarrow q_2 = q_3$.

The step transition system A is finite if $\Sigma$ and Q are finite.

As usual, for any word $u = a_1 \dots a_n \in \Sigma^*$, we write $q \xrightarrow{u} q'$ if there are states
$q_0, \dots, q_n$ such that $q_0 = q$, $q_n = q'$ and for each $i \in [1,n]$, $q_{i-1} \xrightarrow{\{a_i\}} q_i$. Let us
also stress here that we only consider deterministic step transition systems. This
is actually meant to make sure that the local independence relations intuitively
associated to them are complete; in particular, they satisfy Cpl3.

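Definition 1.6. The local trace language associated to a step transition system
$A = (Q, s, \Sigma, \to)$ is the structure $\mathcal{L} = (\Sigma, I, L)$ where

- $\forall u \in \Sigma^*$: $u \in L \Leftrightarrow \exists q \in Q,\ s \xrightarrow{u} q$;

- $\forall u \in \Sigma^*, \forall p \in \wp_f(\Sigma)$: $(u,p) \in I \Leftrightarrow \exists q_1, q_2 \in Q,\ s \xrightarrow{u} q_1 \xrightarrow{p} q_2$.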
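
Membership in L and I can be decided by running words through the transition system. The sketch below is a Python illustration with assumed encodings: a deterministic step transition system is a dictionary from pairs (state, step) to states, steps are frozensets of single-character actions, states are any values other than `None`, and the names `run`, `in_language`, `in_independence` and `diamond` are hypothetical.

```python
def run(trans, q, u):
    """Follow the word u from state q by singleton steps; None if undefined."""
    for a in u:
        q = trans.get((q, frozenset({a})))
        if q is None:
            return None
    return q


def in_language(trans, s, u):
    """u in L iff some state is reached from the initial state s by u."""
    return run(trans, s, u) is not None


def in_independence(trans, s, u, p):
    """(u, p) in I iff the step p is enabled in the state reached by u."""
    q = run(trans, s, u)
    return q is not None and (q, frozenset(p)) in trans


# A four-state example in which a and b can be executed as one concurrent step.
E, A, B, AB = frozenset(), frozenset("a"), frozenset("b"), frozenset("ab")
diamond = {
    (0, E): 0, (0, A): 1, (0, B): 2, (0, AB): 3,
    (1, E): 1, (1, B): 3,
    (2, E): 2, (2, A): 3,
    (3, E): 3,
}
```

Using a dictionary makes the transition relation deterministic by construction; the other two conditions of a step transition system (empty steps are idle, steps can be split) hold for the `diamond` example but are not checked by this sketch.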

Step transition systems define naturally a notion of recognizability which 
extends a similar notion well-known and widely studied in the case of classical 
language theory or classical trace languages. 

Definition 1.7. A local trace language is recognizable if it is the language of a 
finite step transition system. 

Note here that if $\mathcal{L} = (\Sigma, I, L)$ is recognizable then L is a recognizable language
of $\Sigma^*$, but the converse is false (except, e.g., for Mazurkiewicz trace languages
over finite independence alphabets).



2 Stable Trace Languages 

We introduce in this section the subclass of stable trace languages. The latter
generalize Mazurkiewicz traces and Nielsen, Sassone and Winskel’s generalized 
trace languages m- 

Cube Properties in Local Trace Languages. Stable trace languages are 
characterized by cube properties that can be formalized as follows. 

Definition 2.1. A stable trace language is a local trace language $\mathcal{L} = (\Sigma, I, L)$
such that

S1: $\forall u \in \Sigma^*, \forall n \ge 2, \forall a_1, \dots, a_n \in \Sigma$ distinct:

$[\forall \sigma : [1,n] \to [1,n] \text{ onto}: u.a_1 \dots a_n \sim u.a_{\sigma(1)} \dots a_{\sigma(n)}] \Rightarrow (u, \{a_1, \dots, a_n\}) \in I$

S2: $\forall u \in \Sigma^*, \forall a, b, c \in \Sigma$ distinct:

$[(u, \{a, c\}) \in I \wedge (u.a, \{b, c\}) \in I] \Rightarrow [(u, \{a, b\}) \in I \Leftrightarrow (u.c, \{a, b\}) \in I]$

Condition S1 asserts that whenever a set of actions may be executed in any order
after a given sequence without affecting the resulting trace then these actions
are mutually independent. Note here that the converse always holds. Therefore
S1 means simply that the independence relation I is somehow determined by its
trace equivalence $\sim$. Now, the second condition S2 requires that the concurrency
between actions satisfies some local properties. As explained by the following
proposition, this ensures that the set of traces of a stable trace language satisfies
some cube properties (CP) similar to those used to characterize stably concurrent
automata.

Proposition 2.2. Let $\mathcal{L} = (\Sigma, I, L)$ be a local trace language satisfying S1. In
the following diagrams, for all $u, v \in \Sigma^*$, we note $[u] \to [v]$ if there is $a \in \Sigma$
such that $u.a \sim v$. The language $\mathcal{L}$ is stable iff $\forall u \in \Sigma^*, \forall a, b, c \in \Sigma$ distinct:

[Diagrams: cube properties (CP)]

It is clear that any Mazurkiewicz trace language is a stable trace language.
Let us also mention here that Nielsen, Sassone and Winskel’s generalized trace
languages m can be identified with the stable trace languages which satisfy the
following additional coherence property: if $(u, \{a, b\}) \in I$, $(u, \{a, c\}) \in I$ and
$(u, \{b, c\}) \in I$ then $(u, \{a, b, c\}) \in I$.

Stably Concurrent Automata. We present now the very natural connection 
between stable trace languages and stably concurrent automata. 

Definition 2.3. m An automaton with concurrency relations over the alphabet
$\Sigma$ is a structure $A = (Q, s, \Sigma, \to, (\parallel_q)_{q \in Q})$ such that

1. Q is a non-empty set of states, with an initial state s;

2. $\to\ \subseteq Q \times \Sigma \times Q$ is a set of transitions assumed deterministic, i.e. whenever
$p \xrightarrow{a} q$ and $p \xrightarrow{a} r$ then q = r;

3. $(\parallel_q)_{q \in Q}$ is a family of irreflexive, symmetric binary relations on $\Sigma$; it is
required that whenever $a \parallel_p b$ then there exist transitions $p \xrightarrow{a} q$, $p \xrightarrow{b} q'$,
$q \xrightarrow{b} r$ and $q' \xrightarrow{a} r$.

Note that we only consider automata with concurrency relations provided with
a single initial state. On the other hand, the set of states and the alphabet
may be infinite. The language L associated to an automaton with concurrency
relations is the set of finite sequences $u = a_1 \dots a_n \in \Sigma^*$ such that there are
states $q_0, \dots, q_n$ for which $s = q_0$ and for each $i \in [1,n]$, $q_{i-1} \xrightarrow{a_i} q_i$. For short,
these conditions will be denoted by $s \xrightarrow{u} q_n$. Now the independence relations $\parallel_q$
provide naturally an equivalence relation over L as follows. The trace equivalence
$\sim$ associated to A is the least equivalence over L such that $\forall u, v \in \Sigma^*, \forall a, b \in \Sigma$:

$$s \xrightarrow{u} p \xrightarrow{ab} q \xrightarrow{v} r \wedge a \parallel_p b \Rightarrow u.ab.v \sim u.ba.v.$$

For many different reasons, it appears that one may expect the independence
relations $\parallel_q$ to depend locally on each other. In that way, particular attention
has been devoted to stably concurrent automata. In the following definition, for
all actions a, b and c, and for all states q, we note $a \parallel_{q.c} b$ if there exists a state
$q' \in Q$ such that $q \xrightarrow{c} q'$ and $a \parallel_{q'} b$.

Definition 2.4. An automaton with concurrency relations A is called a stably
concurrent automaton if for all $q \in Q$ and all actions $a, b, c \in \Sigma$, the following
equivalence holds: $a \parallel_q c \wedge b \parallel_q c \wedge a \parallel_{q.c} b \Leftrightarrow a \parallel_q b \wedge b \parallel_{q.a} c \wedge a \parallel_{q.b} c$. We say that
A is finite if Q and $\Sigma$ are finite.
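
For a finite automaton the equivalence of Def. 2.4 can be checked by brute force. The following Python sketch is illustrative only: the encoding (a dictionary `delta` from (state, action) to state, a dictionary `par` from states to sets of two-element frozensets) and the names `indep_at` and `is_stably_concurrent` are assumptions, and the example automaton is the three-dimensional cube in which states are the sets of actions already executed.

```python
from itertools import combinations


def indep_at(par, q, a, b):
    """a ||_q b, where q may be None when the preceding transition is undefined."""
    return q is not None and frozenset((a, b)) in par.get(q, set())


def is_stably_concurrent(states, alphabet, delta, par):
    for q in states:
        for a in alphabet:
            for b in alphabet:
                for c in alphabet:
                    lhs = (indep_at(par, q, a, c) and indep_at(par, q, b, c)
                           and indep_at(par, delta.get((q, c)), a, b))
                    rhs = (indep_at(par, q, a, b)
                           and indep_at(par, delta.get((q, a)), b, c)
                           and indep_at(par, delta.get((q, b)), a, c))
                    if lhs != rhs:
                        return False
    return True


acts = "abc"
states = [frozenset(s) for k in range(4) for s in combinations(acts, k)]
# each action can be executed once; the state records what has been executed
delta = {(q, x): q | {x} for q in states for x in acts if x not in q}
# at every state, all pairs of not yet executed actions are independent
par = {q: {frozenset(pr) for pr in combinations([x for x in acts if x not in q], 2)}
       for q in states}

# Variant: a and b are no longer independent once c has been executed.
broken = dict(par)
broken[frozenset("c")] = set()
```

The cube satisfies the equivalence, while the variant `broken` violates it at the initial state: the right-hand side holds for (a, b, c) but a and b are not independent after c.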

A fundamental property of stably concurrent automata is the following
correspondence between the trace equivalence $\sim$ and the family of independence
relations $(\parallel_q)_{q \in Q}$: $\forall u \in \Sigma^*, \forall a, b \in \Sigma$ distinct, $u.ab \sim u.ba \Leftrightarrow s \xrightarrow{u} q \wedge a \parallel_q b$.
Therefore the assumption on $(\parallel_q)_{q \in Q}$ in Def. 2.4 corresponds precisely to the
cube properties (CP) of Prop. 2.2. Also, the independency of actions is entirely
determined by the trace equivalence. This remark leads us to represent the
behavior of stably concurrent automata by stable trace languages as follows.



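Definition 2.5. Let $A$ be a stably concurrent automaton over $\Sigma$, $L$ be its
language and $\sim$ be its trace equivalence. The stable trace language associated to
$A$ is $\mathcal{L}(A) = (\Sigma, I, L)$ where $\forall u \in \Sigma^*$, $\forall n \in \mathbb{N}$, $\forall a_1, \ldots, a_n \in \Sigma$ distinct:

$$(u, \{a_1, \ldots, a_n\}) \in I \iff
\begin{cases}
u.a_1 \ldots a_n \in L \\
\forall \sigma : [1,n] \to [1,n] \text{ onto}: \ u.a_1 \ldots a_n \sim u.a_{\sigma(1)} \ldots a_{\sigma(n)}
\end{cases}$$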



We easily check that $\mathcal{L}(A)$ is indeed a stable trace language. Moreover the
restriction of the trace equivalence of $\mathcal{L}(A)$ to $L$ is precisely the trace equivalence
of $A$. We stress that $\mathcal{L}(A)$ is a representation of the behavior of $A$ equivalent to
the labelled dl-domain usually considered (see e.g. [3]). Furthermore any stable
trace language is the language of a stably concurrent automaton. Besides, a
stable trace language is recognizable if and only if it is the trace language of a
finite stably concurrent automaton.



Full Stable Trace Languages Are Stable Right-Congruences. Although
stable trace languages play a central role in relating stably concurrent automata
to CCI sets of P-traces, we first need to introduce an equivalent representation
in the form of particular right-congruences.



Definition 2.6. Let $\sim$ be a right-congruence over $\Sigma^*$. The associated diamond
relation $\sim_\diamond$ is the least right-congruence over $\Sigma^*$ such that

$$\forall u \in \Sigma^*,\ \forall a, b \in \Sigma,\quad u.ab \sim u.ba \Rightarrow u.ab \sim_\diamond u.ba.$$

We say that the right-congruence $\sim$ is homotopic if $\sim\ =\ \sim_\diamond$.

It is clear that for all right-congruences $\sim$, $\sim_\diamond\ \subseteq\ \sim$. The converse inclusion holds
in particular for the trace equivalence of any local trace language, which is thus
a homotopic right-congruence.

Definition 2.7. A right-congruence over $\Sigma^*$ is stable if it is homotopic and
satisfies axiom (CP) of Prop. 2.2 whenever $a, b, c \in \Sigma$ are distinct and $u \in \Sigma^*$.

Clearly, the trace equivalence of a stable trace language is a stable
right-congruence. However, different stable trace languages may determine the same
trace equivalence. That is why we focus now on full local trace languages. The
latter are defined as the local trace languages $\mathcal{L} = (\Sigma, I, L)$ such that $L = \Sigma^*$.

Proposition 2.8. An equivalence relation over $\Sigma^*$ is the trace equivalence of a
stable trace language if and only if it is a stable right-congruence. Moreover, in
that case, it is the trace equivalence of a unique full stable trace language.

3 CCI Sets of P-Traces 

We show here a one-to-one correspondence between Arnold's CCI sets of
P-traces [1] and full stable trace languages. Moreover, regular CCI sets of P-traces
correspond to recognizable full stable trace languages.

P-Traces. In this section we consider a fixed alphabet $\Sigma$. Note here that we
shall consider a slight extension of Arnold's approach since $\Sigma$ may be infinite.

Definition 3.1. A P-trace $t$ over $\Sigma$ is a triple $(E_t, \preceq_t, \xi_t)$ where $(E_t, \preceq_t)$ is
a finite partial order and $\xi_t$ is a mapping from $E_t$ to $\Sigma$ such that for all $x, y \in E_t$,
$\xi_t(x) = \xi_t(y) \Rightarrow (x \preceq_t y \text{ or } y \preceq_t x)$.

Definition 3.2. A linear extension of a P-trace $t = (E_t, \preceq_t, \xi_t)$ is a total order
$\preceq$ over $E_t$ such that $\preceq_t\ \subseteq\ \preceq$.

Now, linear extensions of a P-trace $t$ can easily be identified with words over
$\Sigma$. Formally, let $n$ be the cardinality of $E_t$. For any linear extension $\preceq$ of $t$, there
is only one way to write $E_t = \{e_1, \ldots, e_n\}$ with $e_i \preceq e_j \iff i \le j$. Then the word
associated to $\preceq$ is $\xi_t(e_1) \ldots \xi_t(e_n)$. Clearly, this mapping from linear extensions of
$t$ to words is one-to-one. In the following, we shall identify any linear extension
of $t$ with its associated word.

Definition 3.3. Let $t$ be a P-trace over $\Sigma$. We note $LE(t)$ the set of all the
words associated to a linear extension of $t$.
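
Definitions 3.1 to 3.3 can be illustrated by a small brute-force enumeration. The following Python sketch is only an illustration under stated assumptions: the function name `linear_extensions`, the event names `p1`, `p2`, `c1` and the encoding of the order as a set of pairs are hypothetical choices, not part of the original development, and the enumeration is exponential in the number of events.

```python
from itertools import permutations

def linear_extensions(events, order, label):
    """Return LE(t) as a set of words (tuples of labels).

    events: list of event names (the set E_t)
    order:  set of pairs (x, y) meaning that x precedes y; it need not be
            transitively closed, since any total order containing these
            pairs also contains their transitive closure
    label:  dict mapping each event to its action (the map xi_t)
    """
    words = set()
    for perm in permutations(events):
        position = {e: k for k, e in enumerate(perm)}
        # Keep the permutation only if it respects every ordering constraint.
        if all(position[x] < position[y] for (x, y) in order):
            words.add(tuple(label[e] for e in perm))
    return words

# Hypothetical P-trace: two productions p1 < p2 and one consumption c1 after p1.
# Events carrying the same label (p1 and p2) are comparable, as Def. 3.1 requires.
events = ["p1", "p2", "c1"]
order = {("p1", "p2"), ("p1", "c1")}
label = {"p1": "p", "p2": "p", "c1": "c"}
LE = linear_extensions(events, order, label)
```

Here `LE` contains exactly the two words `ppc` and `pcp`, encoded as tuples of letters.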

P-traces are naturally structured with a notion of isomorphism: two P-traces
$t = (E_t, \preceq_t, \xi_t)$ and $t' = (E_{t'}, \preceq_{t'}, \xi_{t'})$ are isomorphic if there is a bijection $\sigma$
from $E_t$ to $E_{t'}$ such that

- $\forall x, y \in E_t$: $x \preceq_t y \iff \sigma(x) \preceq_{t'} \sigma(y)$;

- $\forall x \in E_t$: $\xi_t(x) = \xi_{t'}(\sigma(x))$.

Clearly, two isomorphic P-traces admit the same linear extensions. Noteworthy 
is the converse property due to Szpilrajn m- 

Proposition 3.4. Two P-traces $t$ and $t'$ are isomorphic iff $LE(t) = LE(t')$.

CCI Sets of P-Traces Are Stable Right-Congruences Too. As in the
classical case, a P-trace is meant to represent one concurrent execution of a
distributed system. In order to describe all the possible behaviors of a system,
one has to consider sets of P-traces.

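Definition 3.5. Let $\mathbb{P}$ be a set of P-traces over $\Sigma$. We say that $\mathbb{P}$ is
consistent and complete if

- $\bigcup_{t \in \mathbb{P}} LE(t) = \Sigma^*$ [Complete]

- $\forall t, t' \in \mathbb{P}$, $t \ne t' \Rightarrow LE(t) \cap LE(t') = \emptyset$ [Consistent]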

Each consistent and complete set of P-traces determines an equivalence relation
$\sim$ over $\Sigma^*$ whose equivalence classes are the linear extensions of its elements.
This equivalence will be called the trace equivalence of $\mathbb{P}$.

However, this equivalence relation is sometimes not a right-congruence, which 
is admittedly still a natural assumption for traces. That is why, following Arnold, 
we focus on ideal sets of P-traces. These are defined according to the following 
partial order of P-traces. 

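Definition 3.6. We say that a P-trace $t = (E_t, \preceq_t, \xi_t)$ is a prefix of a P-trace
$t' = (E_{t'}, \preceq_{t'}, \xi_{t'})$ if the following conditions are satisfied:

- $E_t \subseteq E_{t'}$;

- $\forall x \in E_t$, $\xi_t(x) = \xi_{t'}(x)$;

- $\preceq_t\ =\ \preceq_{t'} \cap\, (E_t \times E_t)$;

- $\forall x \in E_t$, $\forall y \in E_{t'}$: $y \preceq_{t'} x \Rightarrow y \in E_t$.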



Definition 3.7. A set $\mathbb{P}$ of P-traces over $\Sigma$ is ideal if for all $t \in \mathbb{P}$, if $t'$ is a prefix
of $t$ then $t' \in \mathbb{P}$. A complete, consistent and ideal set of P-traces will be called
CCI for short.

As a useful consequence of [1, Prop. 3.1 and 3.3], our first result relates CCI sets
of P-traces and stable right-congruences as follows.

Theorem 3.8. An equivalence relation over $\Sigma^*$ is the trace equivalence of a
CCI set of P-traces if and only if it is a stable right-congruence (Def. 2.7).

Thus any CCI set of P-traces describes a stable right-congruence and
consequently it is associated to a uniquely determined full stable trace language
(Prop. 2.8). Conversely, any (full) stable trace language can be associated to a
CCI set of P-traces which is essentially unique up to the natural isomorphism
notion defined as follows. We say that two sets of P-traces $\mathbb{P}_1$ and $\mathbb{P}_2$ are
isomorphic if there is a bijection $\sigma$ from $\mathbb{P}_1$ to $\mathbb{P}_2$ such that for all P-traces $t \in \mathbb{P}_1$, $\sigma(t)$
and $t$ are isomorphic P-traces. Clearly, two CCI sets of P-traces are isomorphic
iff their associated trace equivalences are equal. Thus, up to an isomorphism,
Prop. 2.8 and Th. 3.8 show that each stable trace language can be associated to
the unique CCI set of P-traces which determines the same trace equivalence.

Now the behaviors of stably concurrent automata are not full stable trace
languages, except if one provides them with an additional sink state. Thus the
traces of a stably concurrent automaton are described by a consistent and ideal
set of P-traces (it is complete only if its language is $\Sigma^*$). This result completes
actually some similar connections established independently in [^. 

Regular CCI Sets of P-Traces vs Recognizable Languages. Theorem 3.8
and Proposition 2.8 establish a one-to-one correspondence between CCI sets of
P-traces and full stable trace languages. We explain here that this relationship
also holds between recognizable stable trace languages and regular CCI sets of
P-traces. The latter were introduced by Arnold as follows.

Definition 3.9. Let $\mathbb{P}$ be a CCI set of P-traces over a finite alphabet $\Sigma$ and
let $\sim$ be its associated trace equivalence. We consider the equivalence relation $\equiv$
over $\Sigma^*$ such that $u \equiv v$ if $\forall w, w' \in \Sigma^*$, $u.w \sim u.w' \iff v.w \sim v.w'$.

The set $\mathbb{P}$ is called regular if the equivalence $\equiv$ is of finite index.

Using [3, Prop. 2.7], we can now complete Prop. 2.8 and Th. 3.8 as follows.

Proposition 3.10. A CCI set of P-traces is regular if and only if its associated
full stable trace language is recognizable.

4 Stable Trace Languages vs Mazurkiewicz Ones 

We now show how stable trace languages relate to Mazurkiewicz ones. We ex- 
plain that stable trace languages form a true generalization of Mazurkiewicz trace 
languages through the particularly useful example of a Producer-Consumer sys- 
tem. However, any stable trace language may be regarded simply as a labelled 
Mazurkiewicz trace language. This will be formalized here by a notion of pro- 
jections. 

Projections of Local Trace Languages. We first recall the natural structure 
of local trace languages by morphisms introduced in m- 

Definition 4.1. Let $\mathcal{L} = (\Sigma, I, L)$ and $\mathcal{L}' = (\Sigma', I', L')$ be two local trace
languages. A morphism $\lambda$ from $\mathcal{L}$ to $\mathcal{L}'$ is a map $\lambda : \Sigma \to \Sigma'$ such that

- $\forall (u, p) \in I$, $(\lambda(u), \lambda(p)) \in I'$;

- $\forall (u, \{a, b\}) \in I$: $a \ne b \Rightarrow \lambda(a) \ne \lambda(b)$.

Note that if two distinct actions $a$ and $b$ are independent after $u$ then their
images should be independent after $\lambda(u)$ in order to respect concurrency: that
is why we require that $\lambda(a) \ne \lambda(b)$. Clearly if $u_1$ and $u_2$ are trace equivalent
according to $I$ then $\lambda(u_1)$ and $\lambda(u_2)$ are trace equivalent according to $I'$.

In this paper, we introduce particular morphisms which ensure several nice
correspondences between the related local trace languages.

Definition 4.2. A projection from $\mathcal{L} = (\Sigma, I, L)$ to $\mathcal{L}' = (\Sigma', I', L')$ is a
morphism $\lambda : \mathcal{L} \to \mathcal{L}'$ whose underlying map $\lambda : \Sigma \to \Sigma'$ is onto and such that
$\forall (u', p') \in I'$, $\exists (u, p) \in I$, $\lambda(u) = u' \wedge \lambda(p) = p'$.

We remark first that the trace equivalence is faithfully preserved and reflected by
projections. Moreover there is a one-to-one correspondence between the traces
of $\mathcal{L}$ and those of $\mathcal{L}'$.

Lemma 4.3. Let $\lambda$ be a projection from $\mathcal{L} = (\Sigma, I, L)$ to $\mathcal{L}' = (\Sigma', I', L')$. Then
$\lambda : \Sigma^* \to \Sigma'^*$ induces a bijection between $L$ and $L'$. Moreover, for all $u_1, u_2$ in
$L$, $u_1 \sim u_2 \iff \lambda(u_1) \sim \lambda(u_2)$.

Therefore, projections of local trace languages should be regarded as simple
and faithful labellings. If $\lambda$ is a projection then we will say that $\mathcal{L}'$ is the
image of $\mathcal{L}$ through the projection $\lambda$. It is clear that the image of a recognizable
local trace language through a projection is recognizable.

Projections of Mazurkiewicz Trace Languages. The connection between 
Mazurkiewicz trace languages and stable trace languages is first established by 
Theorem 4.4 below. It asserts that any stable trace language is the projection of a
Mazurkiewicz trace language. This result can be established by means of known 
relationships between stably concurrent automata, prime event structures, and 
dl-domains p2pn] — at least if we assume that all alphabet is countable. How- 
ever, a direct proof can be achieved without this assumption. It follows in fact 
the same basic idea since it relies on equivalences of prime intervals 

Theorem 4.4. A local trace language (over a possibly infinite alphabet) is sta- 
ble if and only if it is the image of a Mazurkiewicz trace language through a 
projection. 

The connection between projections of Mazurkiewicz languages and stable 
languages expressed in the preceding theorem also applies to the subclasses of 
recognizable languages (over finite alphabets) . This very interesting result will be 
used in the last two sections of this paper. It is a direct reformulation of Arnold's
work [1, Th. 6.16] with the help of Prop. 3.10.

Theorem 4.5. A stable trace language is recognizable if and only if it is the 
image of a recognizable Mazurkiewicz trace language through a projection. 

The Producer-Consumer System. We are now interested in languages over
finite alphabets. An open problem raised by Arnold [1] is to know whether each
stable trace language over a finite alphabet is the image of a Mazurkiewicz trace
language over a finite alphabet through a projection¹. We give a negative answer
to this question through the example of a Producer-Consumer system.

We consider the alphabet $\Sigma = \{p, c\}$ where $p$ represents a production of one
item and $c$ a consumption. The language of the system describes all the possible
sequences for which at each stage there may not be more consumptions than
productions. Formally, $L = \{u \in \Sigma^* \mid \forall v \le u,\ |v|_p \ge |v|_c\}$. Thus $p$, $pc$, $ppc$ and
$pcp$ are sequential executions of the system. We now want to model a possible
independency between the producer and the consumer. Provided that enough
items have already been produced, the producer and the consumer can act
simultaneously. For instance, $ppc \sim pcp$. This can be represented by the local
independence relation $I$ defined as follows:

- $(u, \emptyset) \in I \iff u \in L$;

- $(u, \{c\}) \in I \iff u \in L \wedge |u|_p \ge |u|_c + 1$;

- $(u, \{p\}) \in I \iff u \in L$;

- $(u, \{p, c\}) \in I \iff u \in L \wedge |u|_p \ge |u|_c + 1$.

¹ The question raised by Arnold dealt with CCI sets of P-traces, i.e. full stable trace
languages. We leave it to the reader to adapt our counter-example accordingly.
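
The four clauses defining the local independence relation of the Producer-Consumer system translate directly into executable form. The sketch below is a hedged illustration: the function names `in_language` and `independent` are hypothetical, words are plain Python strings over the letters `p` and `c`, and a step is passed as a set of letters.

```python
def in_language(u):
    """u belongs to L iff every prefix v of u satisfies |v|_p >= |v|_c."""
    balance = 0  # number of productions minus number of consumptions so far
    for action in u:
        balance += 1 if action == "p" else -1
        if balance < 0:
            return False
    return True

def independent(u, step):
    """Decide whether (u, step) belongs to I, for step a subset of {"p", "c"}."""
    if not in_language(u):
        return False
    if "c" in step:
        # A consumption needs at least one item that is produced but not consumed.
        return u.count("p") >= u.count("c") + 1
    return True
```

For instance `independent("p", {"p", "c"})` holds, which reflects the equivalence of `ppc` and `pcp`, whereas `independent("", {"p", "c"})` fails because nothing has been produced yet.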

Clearly, $\mathcal{L} = (\Sigma, I, L)$ is a stable trace language. We now prove by contradiction
that $\mathcal{L}$ is not the image of a Mazurkiewicz trace language over a finite alphabet
through a projection. Let us assume that $\mathcal{L}$ is the image of a Mazurkiewicz trace
language $\mathcal{L}' = (\Sigma', I', L')$ over a finite independent alphabet $(\Sigma', \|')$ through a
projection $\lambda$. Let $n$ denote the size of $\Sigma'$. We consider the sequence $u = (p.c)^{n+1}$
consisting of $n + 1$ productions and $n + 1$ consumptions. There is a unique
sequence $v \in L'$ such that $\lambda(v) = u$. Let us write $v = a_1.b_1.a_2.b_2 \ldots a_{n+1}.b_{n+1}$.
Clearly, $\lambda(a_i) = p$ and $\lambda(b_i) = c$ for all $i \in [1, n+1]$. We easily check that
for any $i \in [1,n]$, we have $b_i \not\|'\, a_i$ and for all $j \in [i+1, n]$, $b_i \,\|'\, a_j$. Now there
are $i_1, i_2 \in [1,n]$ such that $i_1 < i_2$ and $b_{i_1} = b_{i_2}$ because $\mathrm{Card}(\Sigma') = n$ (each $b_i$
differs from $a_1$, whose image is $p$, so the $b_i$ take at most $n - 1$ values). Hence
$b_{i_2} = b_{i_1} \,\|'\, a_{i_2}$, which contradicts $b_{i_2} \not\|'\, a_{i_2}$.

5 Distributed Implementation of Stable Trace Languages 

In this section, we establish that each recognizable stable trace language admits a
distributed implementation in the form of a finite bounded Petri net. According
to Prop. 3.10, this result also holds for regular CCI sets of P-traces. Furthermore,
this means that the labelled dl-domain of any finite stably concurrent automaton
is also described by a finite bounded Petri net.

We consider here the classical model of Place/Transition nets. 

Definition 5.1. A Petri net is a quadruple $N = (S, T, W, M_{in})$ where

- $S$ is a set of places and $T$ is a set of transitions such that $S \cap T = \emptyset$;

- $W$ is a map from $(S \times T) \cup (T \times S)$ to $\mathbb{N}$, called weight function;

- $M_{in}$ is a map from $S$ to $\mathbb{N}$, called initial marking.

Given a Petri net $N = (S, T, W, M_{in})$, $\mathrm{Mar}_N$ denotes the set of all markings
of $N$, that is to say functions $M : S \to \mathbb{N}$; a step $p \in \wp_f(T)$ is enabled at
$M \in \mathrm{Mar}_N$ if $\forall s \in S$, $M(s) \ge \sum_{t \in p} W(s, t)$. In this case, we note $M [p\rangle M'$ where
$M'(s) = M(s) + \sum_{t \in p} (W(t, s) - W(s, t))$ and say that the transitions of $p$ may
be fired concurrently and lead to the marking $M'$. A step firing sequence consists
of a sequence of markings $M_0, \ldots, M_n$ and a sequence of steps $p_1, \ldots, p_n \in \wp_f(T)$
such that $M_0 = M_{in}$ and $\forall k \in [1,n]$, $M_{k-1} [p_k\rangle M_k$. In that case, $M_n$ is said to be
reachable.
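
The step semantics above can be made concrete as follows. This is a minimal sketch under stated assumptions: the weight function is stored as a dictionary keyed by (place, transition) or (transition, place) pairs with missing keys standing for weight 0, and the net with a single place `buf` and transitions `prod` and `cons` is a hypothetical example chosen to echo the Producer-Consumer system.

```python
def enabled(marking, step, W):
    """A step is enabled at a marking if every place holds at least the
    tokens that all transitions of the step consume together."""
    return all(
        marking[s] >= sum(W.get((s, t), 0) for t in step)
        for s in marking
    )

def fire(marking, step, W):
    """Return the marking reached by firing all transitions of the step at once."""
    if not enabled(marking, step, W):
        raise ValueError("step not enabled at this marking")
    return {
        s: marking[s] + sum(W.get((t, s), 0) - W.get((s, t), 0) for t in step)
        for s in marking
    }

# Hypothetical net: 'prod' puts one token into 'buf', 'cons' removes one.
W = {("prod", "buf"): 1, ("buf", "cons"): 1}
M_in = {"buf": 0}
M1 = fire(M_in, {"prod"}, W)
```

In this example the step `{"prod", "cons"}` is not enabled at `M_in` but becomes enabled at `M1`, where firing it leaves the marking unchanged.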

Definition 5.2. A labelled Petri net is a structure $(S, T, W, M_{in}, \xi)$ where $(S, T,
W, M_{in})$ is a Petri net and $\xi$ is a map from $T$ to an alphabet $\Sigma$ such that for all
firing sequences $M_{in} = M_0 [p_1\rangle \ldots [p_n\rangle M_n$ and all transitions $t, t' \in T$:

$$M_n [\{t\}\rangle \;\wedge\; M_n [\{t'\}\rangle \;\wedge\; \xi(t) = \xi(t') \;\Rightarrow\; t = t'.$$

The restriction adopted for the labelling $\xi : T \to \Sigma$ ensures that two distinct transitions
enabled by a common reachable marking correspond to two distinct actions. In
other words, the labelling is deterministic.

Definition 5.3. The local trace language associated to a labelled Petri net $\mathcal{N} =
(S, T, W, M_{in}, \xi)$ is $\mathcal{L}(\mathcal{N}) = (\Sigma, I, L)$ where $I = \{(\xi(t_1 \ldots t_n), \xi(p)) \mid (t_1 \ldots t_n, p) \in
T^* \times \wp_f(T) \wedge M_{in} [\{t_1\}\rangle M_1 \ldots [\{t_n\}\rangle M_n [p\rangle\}$ and the set of sequential executions
is $L = \{u \in \Sigma^* \mid (u, \emptyset) \in I\}$.

Let us now focus on finite Petri nets, that is to say nets with a finite number
of places and transitions, which are also bounded, which means that there are
only a finite number of reachable markings. It is clear that local trace languages
of such Petri nets are recognizable. Using Th. 4.5 and Zielonka's theorem
we can establish the converse property for stable trace languages. Roughly, the
proof proceeds as follows. Given a recognizable stable trace language $\mathcal{L}$, we consider a
recognizable Mazurkiewicz trace language $\mathcal{L}_M = (\Sigma_M, I_M, L_M)$ over $(\Sigma_M, \|_M)$
and a projection $\lambda : \mathcal{L}_M \to \mathcal{L}$ by using Th. 4.5. Then Zielonka's theorem
yields an asynchronous automaton $A$ over $(\Sigma_M, \|_M)$ recognizing $L_M$. We regard
$A$ as if all its states were final and describe its behavior by a (1-safe) Petri
net labelled by $\xi$. Then the trace language of this Petri net includes $\mathcal{L}_M$. The
technical point is then to add some places and to adapt the weight function in
order to restrict the behavior of the net to $L_M$, without affecting the independency
of the transitions. Finally, the labelling of the final net is changed into $\lambda \circ \xi$.

Theorem 5.4. Any recognizable stable trace language is the local trace language
of a finite bounded labelled Petri net.



6 Asynchronous Transition Systems vs Trace Automata 

Motivated by domain theoretic considerations, Schmitt established in [17] that
any finite stable trace automaton is covered by a finite asynchronous transition
system, which thus describes the same coherent dl-domain. We explain here
how Theorem 4.5 provides a new approach to prove this result easily and yields
an algorithm for the construction of such an asynchronous transition system.

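Definition 6.1. Let $(\Sigma, \|)$ be an independent alphabet. An independent
automaton over $(\Sigma, \|)$ is a structure $A = (Q, s, \Sigma, \to, \|)$ where $Q$ is a set of
states, with initial state $s \in Q$ and $\to\ \subseteq Q \times \Sigma \times Q$ is a transition relation such
that $q \xrightarrow{a} q_1 \wedge q \xrightarrow{a} q_2 \Rightarrow q_1 = q_2$.

A trace automaton is an independent automaton which satisfies the Forward
Diamond property FD:

FD: $q \xrightarrow{a} q_1 \wedge q \xrightarrow{b} q_2 \wedge a \| b \Rightarrow \exists q_3 \in Q,\ q_2 \xrightarrow{a} q_3 \wedge q_1 \xrightarrow{b} q_3$.

An asynchronous transition system over $(\Sigma, \|)$ is an independent automaton
which satisfies FD and the Independent Diamond property ID:

ID: $q \xrightarrow{a} q_1 \wedge q_1 \xrightarrow{b} q_2 \wedge a \| b \Rightarrow \exists q_3 \in Q,\ q \xrightarrow{b} q_3 \wedge q_3 \xrightarrow{a} q_2$.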
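
Both diamond properties can be checked mechanically on a finite independent automaton. The following sketch assumes a hypothetical encoding in which the deterministic transition relation is a dictionary `delta` from (state, action) pairs to states and the independence relation is a symmetric set of action pairs; the function names are illustrative only.

```python
def satisfies_fd(states, alphabet, delta, indep):
    """Forward Diamond: two independent actions enabled at the same state
    can be performed in either order and lead to a common state."""
    for q in states:
        for a in alphabet:
            for b in alphabet:
                if (a, b) not in indep:
                    continue
                q1, q2 = delta.get((q, a)), delta.get((q, b))
                if q1 is None or q2 is None:
                    continue
                q3 = delta.get((q1, b))
                if q3 is None or delta.get((q2, a)) != q3:
                    return False
    return True

def satisfies_id(states, alphabet, delta, indep):
    """Independent Diamond: two consecutive independent actions can be
    swapped without changing the state that is reached."""
    for q in states:
        for a in alphabet:
            for b in alphabet:
                if (a, b) not in indep:
                    continue
                q1 = delta.get((q, a))
                q2 = delta.get((q1, b)) if q1 is not None else None
                if q2 is None:
                    continue
                q3 = delta.get((q, b))
                if q3 is None or delta.get((q3, a)) != q2:
                    return False
    return True

# Hypothetical examples: a full diamond, and a single path a.b without its mirror.
indep = {("a", "b"), ("b", "a")}
diamond = {(0, "a"): 1, (0, "b"): 2, (1, "b"): 3, (2, "a"): 3}
path = {(0, "a"): 1, (1, "b"): 2}
```

The automaton `diamond` satisfies both properties, whereas `path` satisfies FD vacuously but violates ID, which shows why ID is an additional requirement for asynchronous transition systems.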

We shall assume in this paper that all states of an independent automaton are
reachable². We note that each trace automaton may be regarded as an automaton
with concurrency relations (Def. 2.3) for which $a \,\|_q\, b \iff a \| b \wedge q \xrightarrow{a} q' \wedge q \xrightarrow{b} q''$.
It is clear that each asynchronous transition system, regarded as an automaton
with concurrency relations, is in fact a stably concurrent automaton. That is
not true for trace automata in general (only one implication is fulfilled). That is
why Schmitt introduced stable trace automata as follows.

² This means that $\forall q \in Q$, $\exists u \in \Sigma^*$, $s \xrightarrow{u} q$.



Definition 6.2. A trace automaton $A = (Q, s, \Sigma, \to, \|)$ is stable if for all
states $q, r \in Q$ and for all actions $a$, $b$ and $c$ pairwise independent w.r.t. $\|$:

$$q \xrightarrow{abc} r \;\wedge\; q \xrightarrow{acb} r \;\wedge\; [\ldots]\; q \xrightarrow{cba} r.$$



We remark here that a trace automaton is a stably concurrent automaton iff 
it is stable. Therefore any asynchronous transition system is a stable trace au- 
tomaton. In order to strengthen this trivial relationship between stable trace au- 
tomata and asynchronous transition systems, Schmitt used folding morphisms, 
which correspond somehow to projections. 

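Definition 6.3. Let $A = (Q, s, \Sigma, \to, \|)$ and $A' = (Q', s', \Sigma', \to', \|')$ be two trace
automata. A folding morphism from $A$ to $A'$ is a pair of maps $\sigma : Q \to Q'$ and
$\lambda : \Sigma \to \Sigma'$ such that

- $\sigma(s) = s'$;

- $q_1 \xrightarrow{a} q_2 \Rightarrow \sigma(q_1) \xrightarrow{\lambda(a)}{}' \sigma(q_2)$;

- $q_1 \xrightarrow{a} q_2 \wedge q_1 \xrightarrow{b} q_3 \wedge a \ne b \Rightarrow \lambda(a) \ne \lambda(b)$;

- $\sigma(q_1) \xrightarrow{a'}{}' q_2' \Rightarrow \exists q_2 \in Q,\ \exists a \in \Sigma,\ q_1 \xrightarrow{a} q_2 \wedge \lambda(a) = a'$;

- $\forall q \in Q$, $q \xrightarrow{a} q' \wedge q \xrightarrow{b} q'' \Rightarrow [a \| b \iff \lambda(a) \,\|'\, \lambda(b)]$.

In that case, we say that $A$ covers $A'$.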

We can now state the main result of [17] . 

Theorem 6.4. Any finite stable trace automaton is covered by some finite asyn- 
chronous transition system. 

Let us now present a new, simple and constructive proof of this result.
Let $A = (Q, s, \Sigma, \to, \|)$ be a finite stable trace automaton. Viewed as a
stably concurrent automaton, it describes a recognizable stable trace language
$\mathcal{L} = (\Sigma, I, L)$. Applying Theorem 4.5 yields a recognizable Mazurkiewicz trace
language $\mathcal{L}_M = (\Sigma_M, I_M, L_M)$ over an independent alphabet $(\Sigma_M, \|_M)$ and a
projection $\lambda_M : \mathcal{L}_M \to \mathcal{L}$. We consider $A_M = (Q_M, s_M, \Sigma_M, \to_M, F_M)$ to be the
minimal automaton of $L_M$, where $F_M$ denotes the set of final states. Since $L_M$
is recognizable and prefix-closed, $A_M$ is finite and $F_M = Q_M$. We also remark
that $A_M$ satisfies the Independent Diamond property ID w.r.t. $\|_M$, because $L_M$
is closed for the commutation of independent actions. We consider the
synchronized product $A \times A_M = (Q \times Q_M, (s, s_M), \Sigma \times \Sigma_M, \to_\times, \|_\times)$ where

$(q, q_M) \xrightarrow{(a, a_M)}_\times (q', q'_M)$ iff $q \xrightarrow{a} q' \wedge q_M \xrightarrow{a_M}_M q'_M \wedge \lambda_M(a_M) = a$

and $(a, a_M) \,\|_\times\, (b, b_M)$ iff $a \| b \wedge a_M \,\|_M\, b_M$. We easily check that $A \times A_M$ is a
finite asynchronous transition system, once restricted to its reachable states.
Moreover the pair $\sigma_1 : (q, q_M) \mapsto q$ and $\lambda_1 : (a, a_M) \mapsto a$ is a folding morphism
from $A \times A_M$ to $A$.

Let us stress finally that the construction of $A_M$ from $A$ is essentially provided
by Arnold's proof of [1, Th. 6.16]. One can actually deduce from this proof some
upper bounds for the sizes of $\Sigma_M$ and $Q_M$ (w.r.t. the sizes of $Q$ and $\Sigma$). This is
definitely impossible when following Schmitt's approach.
Abstract. We show that any relation between the simulation preorder
and bisimilarity is EXPTIME-hard when systems are given as networks 
of finite state systems (or equivalently as automata with boolean vari- 
ables, etc.). We also show that any relation between trace inclusion and 
ready trace equivalence or possible-futures equivalence is EXPSPACE- 
hard for these systems. 

These results match the already known upper bounds and partially an- 
swer a conjecture by Rabinovich. They strongly suggest that there is no 
way to escape the state explosion problem when checking behavioural 
relations. 

For the branching-time relations, our proof uses a new construction that 
immediately applies to timed automata, a family of systems for which 
these complexity results are new. 



1 Introduction 

The model-checking approach to automated or computer-aided verification is
now widely recognized as a promising development for system design, especially
in the area of critical systems [CGL96]. The main practical limitation of
model-checking is the well-known state explosion problem: the systems we check are
built by composing several subsystems, they use variables and/or clocks, and a
flat equivalent transition system would have an exponential number of states.
Therefore, even if model-checking flat systems is tractable, verifying non-flat
systems has been a major challenge since the beginning.

The state explosion problem can be considered from a pragmatic or from 
a theoretical angle. The pragmatical approach aims, e.g., at designing symbolic 
methods that may bypass the state explosion in many practical cases [T3CM+9^ . 
The theoretical approach studies the structural complexity of model-checking 
non-flat systems, i.e. systems described as combinations of finite-state compo- 
nents. The goal here is to understand better which verification problems have 
to face state explosion in an intrinsic way, which special way of combining sub- 
systems could avoid state explosion, and what are the theoretical limits of all 
approaches, even the best pragmatical ones. 

But what exactly are these non-flat systems? Different models exist:
synchronized products of finite-state automata are a natural possibility, automata
acting on boolean variables are another one, as well as 1-safe Petri nets. From
a structural complexity perspective, these brands of non-flat systems can all
be succinctly encoded into each other and the complexity results hold robustly
across many variant presentations. In this paper we consider synchronized
products of automata (see section 2) but we keep the more general terminology of
“non-flat systems” in this introduction.



An overview of existing results. The literature is limited but the main questions
have been answered:

Classical verification problems: The complexity classes of the main
questions for non-flat systems, like reachability, termination, deadlock-freedom,
etc., are known (e.g., these three examples are PSPACE-complete). Most of
these problems have been investigated in the framework of 1-safe Petri nets,
where they were natural questions since the beginning. An excellent survey
is [Esp98].

Temporal logic: Model-checking PLTL, CTL, or CTL* formulas on non-flat
systems is PSPACE-complete [KVW98]. Model-checking the branching-time
mu-calculus is EXPTIME-complete, even when restricted to the
alternation-free fragment [Rab97b, KVW98].

Behavioural equivalences and preorders: Trace equivalence of non-flat
systems is EXPSPACE-complete [Rab97a] while bisimilarity is
EXPTIME-complete [JM96], as is simulation equivalence [HKV97].



Behavioural equivalences. This third set of problems is where the existing results
are the most incomplete when assessing the state explosion problem. One of the
difficulties here is that the linear time - branching time spectrum contains dozens
of different semantical equivalences [Gla90] (cf. Fig. 1).

However, some general methods apply to several equivalences at once:

(1) [JM96] shows EXPTIME-completeness of seven truly concurrent variants of
bisimulation. One single construction suffices for the lower bounds since all seven
equivalences coincide in the absence of concurrency.

(2) [Rab97a] shows that all equivalences lying between trace equivalence and
bisimilarity are PSPACE-hard. Note that this applies to all classical equivalences
from [Gla90] and also to any new equivalence, however fancy, one would care to
define¹.

Rabinovich’s result is impressive, even more since it has been convincingly 
argued |01a,9fll IPnii851 IMil89j that any interesting equivalence lies between these 
two extremes. However, the result is not optimal since not one relation between 
trace equivalence and bisimilarity is known to be in PSPACE for non-flat
systems. Indeed, [Rab97a] conjectures that all these equivalences are
EXPTIME-hard.



¹ A similar approach appears in [Jan95] where a single construction shows
undecidability, over P/T nets, of all equivalences between trace equivalence and bisimilarity.

Our contribution. We partially answer Rabinovich's conjecture. We prove
EXPTIME-hardness of all equivalences (actually any relation) lying between the
simulation preorder and strong bisimilarity, and EXPSPACE-hardness of all
equivalences (actually any relation) lying between trace inclusion and ready trace
equivalence or possible-futures equivalence.

These results have several important corollaries. First, they close (on non-flat
systems) the gap between lower-bound and upper-bound for the 11 relations van
Glabbeek singles out as most fundamental in his linear time - branching time
spectrum.

Secondly, they entail EXPTIME-hardness (over non-flat systems) of all
model-checking problems for temporal or modal logics able to specify bisimilarity
or simulation. For example, since the branching-time modal mu-calculus can
state bisimilarity through a simple (modal depth 2) formula [And93],
EXPTIME-hardness of bisimilarity entails EXPTIME-hardness of model-checking
mu-calculus formulas over non-flat systems (a result already known from [KVW98,
Rab97b]).

Finally, our technique is interesting in itself: our construction for the
branching-time relations differs from the approach in [JM96]². It originates
from our investigations of complexity questions for Timed Automata [AL99]
and readily gives EXPTIME-hardness of all relations between (strong timed-)
bisimilarity and simulation. To our knowledge this is the first complexity
characterization of behavioural equivalences over these models.

Plan of the paper. We first give basic definitions on (flat and non-flat) systems, 
the behavioural equivalences we need (§ m and alternating Turing machines 
(§11). We then prove our generic EXPTIME lower bound (§[il) and our generic 
EXPSPACE lower bound (§ (H). Upper bounds are given when they match the 
lower bounds. 

Acknowledgments. This work owes much to the comments and suggestions we 
got from L. Jategaonkar and the anonymous referees who saw an earlier version. 

² Additionally, an incorrect labeling of the nets and the omission of some crucial part
of the construction make the proof of Theo. 5.7 in [JM96] hard to repair [Jat99].

2 Equivalences and Preorders between Non-flat Systems

Transition systems. A flat (transition) system is a tuple $C = (\Sigma, Q, \to)$ where
$\Sigma$ is a finite alphabet, $Q$ is a finite set of states, and $\to\ \subseteq Q \times \Sigma \times Q$ is the
transition relation. The size $|C|$ of $C$ is $|\Sigma| + |Q| + |{\to}|$. As usual, we write
$q \xrightarrow{a} q'$ when $(q, a, q') \in\ \to$, and let $ready(q)$ denote the set of actions ready in
$q$, i.e. $\{a \in \Sigma \mid q \xrightarrow{a} q' \text{ for some } q'\}$.

Traces. A trace from $q \in Q$ is any $w = a_1 \ldots a_n \in \Sigma^*$ such that there exist
$q_0, q_1, \ldots, q_n \in Q$ with $q = q_0$ and $q_{i-1} \xrightarrow{a_i} q_i$ for $i = 1, \ldots, n$ (written $q_0 \xrightarrow{w}
q_n$). If $A_i = ready(q_i)$ for $i = 0, \ldots, n$, then $(A_0, a_1, A_1, a_2, A_2, \ldots, a_n, A_n)$ is
a ready trace from $q$. We write $Tr(q)$ (resp. $RT(q)$, $PF(q)$) for the set of traces
(resp. ready traces, possible futures) from $q$ (where $(w, S) \in \Sigma^* \times \mathcal{P}(\Sigma^*)$ is a
possible future of $p$ if there exists $q$ s.t. $p \xrightarrow{w} q$ and $S = Tr(q)$). Trace inclusion,
trace equality, ready trace equivalence and possible-futures equivalence, denoted
$\subseteq_{Tr}$, $=_{Tr}$, $=_{RT}$ and $=_{PF}$, have the obvious definition.



Bisimulations. A simulation over $C$ is any $R \subseteq Q \times Q$ satisfying the following
transfer property: for all $q \,R\, q'$ and $q \xrightarrow{a} r$, there is a $q' \xrightarrow{a} r'$ s.t. $r \,R\, r'$. A
bisimulation is any symmetric simulation. The largest simulation over $C$ exists,
is denoted $\sqsubseteq$, and is called the simulation preorder. The largest bisimulation is
denoted $\underline{\leftrightarrow}$ and is called bisimilarity.
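
On a flat system the simulation preorder can be computed by a simple refinement loop: start from the full relation and delete pairs that violate the transfer property until nothing changes. The sketch below is a naive illustration rather than an optimized algorithm; the function name and the two example systems (the classical pair a.(b + c) and a.b + a.c, placed in one disjoint sum) are assumptions made for the example.

```python
def simulation_preorder(states, transitions):
    """Return the largest simulation as a set of pairs (q, q2), meaning that
    q2 simulates q. transitions is a set of triples (q, a, r)."""
    succ = {}
    for (q, a, r) in transitions:
        succ.setdefault((q, a), set()).add(r)
    actions = {a for (_, a, _) in transitions}
    rel = {(q, q2) for q in states for q2 in states}
    changed = True
    while changed:
        changed = False
        for (q, q2) in list(rel):
            # Transfer property: every move of q must be matched by q2.
            ok = all(
                any((r, r2) in rel for r2 in succ.get((q2, a), ()))
                for a in actions
                for r in succ.get((q, a), ())
            )
            if not ok:
                rel.discard((q, q2))
                changed = True
    return rel

# Hypothetical disjoint sum: x0 is a.(b + c), y0 is a.b + a.c.
states = ["x0", "x1", "x2", "x3", "y0", "y1", "y2", "y3", "y4"]
transitions = {
    ("x0", "a", "x1"), ("x1", "b", "x2"), ("x1", "c", "x3"),
    ("y0", "a", "y1"), ("y0", "a", "y2"), ("y1", "b", "y3"), ("y2", "c", "y4"),
}
rel = simulation_preorder(states, transitions)
```

In this example `x0` simulates `y0` but not conversely, so the two states are trace equivalent without being simulation equivalent or bisimilar.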

The hierarchy of equivalences. [Gla90, Gla93] survey the main behavioural
equivalences (and preorders) used in the semantics of concurrent systems. Van
Glabbeek lists dozens of different possibilities between the weakest (trace
equivalence) and the strongest (bisimilarity). The most important stepping stones in
this hierarchy are given in Fig. 1.

As usual, for any such behavioural relation $\mathcal{R}$, we write $(C, q) \,\mathcal{R}\, (C', q')$ when
$q \,\mathcal{R}\, q'$ inside a disjoint sum system $C + C'$. We write $C \,\mathcal{R}\, C'$ when $C$ and $C'$
come with (often implicit) initial states, and $(C, q_0) \,\mathcal{R}\, (C', q_0')$.



Non-flat systems. A non-flat system is a product of flat systems. Formally, it is a
vector $S = (C_1, \ldots, C_k)$ where $C_i = (\Sigma_i, Q_i, \to_i)$ for $i = 1, \ldots, k$. The
flattening $C_S$ of $S$ is the transition system $(\Sigma, Q, \to)$ given by $\Sigma = \Sigma_1 \cup \cdots \cup \Sigma_k$, $Q =
Q_1 \times \cdots \times Q_k$ and where $\to$ is the set of all triples $((q_1, \ldots, q_k), a, (r_1, \ldots, r_k))$
from $Q \times \Sigma \times Q$ s.t. for $i = 1, \ldots, k$ either $q_i \xrightarrow{a}_i r_i$ or ($a \notin \Sigma_i$ and $q_i = r_i$). In
this paper we only need binary synchronization, i.e. where a given $a$ belongs to
at most two different $\Sigma_i$.
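
The flattening of a non-flat system can be computed naively by enumerating the product of the local state sets. The following sketch is given under stated assumptions: each component is encoded as a triple (alphabet, states, transitions), the function name `flatten` is hypothetical, and the two-component example synchronizing on the action `s` is invented for illustration.

```python
from itertools import product

def flatten(components):
    """components: list of (alphabet, states, transitions), where transitions
    is a set of triples (q, a, r). Returns the flattening as a triple
    (alphabet, states, transitions) over tuples of local states."""
    sigma = set().union(*(alph for (alph, _, _) in components))
    states = set(product(*(sts for (_, sts, _) in components)))
    trans = set()
    for qs in states:
        for a in sigma:
            # For each component, list its possible local successors under a.
            choices = []
            for (alph, _, tr), q in zip(components, qs):
                if a in alph:
                    choices.append([r for (p, b, r) in tr if p == q and b == a])
                else:
                    choices.append([q])  # component does not take part in a
            for rs in product(*choices):
                trans.add((qs, a, rs))
    return sigma, states, trans

# Hypothetical system: two components with local actions a and b, synchronizing on s.
C1 = ({"a", "s"}, {0, 1}, {(0, "a", 1), (1, "s", 0)})
C2 = ({"b", "s"}, {0, 1}, {(0, "b", 1), (1, "s", 0)})
sigma, states, trans = flatten([C1, C2])
```

The number of global states is the product of the local state counts, which is the source of the exponential blow-up in the number of components.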

For a behavioural relation $\mathcal{R}$, deciding whether $S \mathrel{\mathcal{R}} S'$ means deciding whether $C_S \mathrel{\mathcal{R}} C_{S'}$. A naive algorithm for this problem is any algorithm that computes $C_S$ and $C_{S'}$. Let $|S| = |C_1| + \cdots + |C_k|$. Then $|C_S|$ is $O(|C_1| \times \cdots \times |C_k|)$, hence $O(2^{|S|})$. This is known as state explosion.
Non-flat systems in the literature. [Rab97b] uses automata acting on boolean variables (or more generally on variables with a finite domain), [Esp98, JM96] use 1-safe labeled Petri nets. Products of finite automata are sometimes called concurrent systems [KVW98, Rab97a]. When relabeling of actions is allowed, any of these models can be directly translated into any other.$^3$ For maximal generality, we prove our hardness results for products without relabeling (and our upper bounds through naive algorithms that easily handle relabelings), which is one more way we strengthen the results from [Rab97a].



3 Alternating Machines 

An Alternating Turing Machine [CKS81] (an ATM for short) is a tuple $A = (Q, \Sigma, \delta, l, q_0, q_F)$ where $Q = \{q, \ldots\}$ is the set of states, $\Sigma = \{a, \ldots\}$ is the tape alphabet containing a special blank symbol, $\delta \subseteq Q \times \Sigma \times Q \times \Sigma \times \{L, R\}$ is the set of transitions, $q_0 \in Q$ is the initial state, $q_F \in Q$ is the final (accepting) state and $l : Q \to \{\vee, \wedge\}$ labels each state as either disjunctive or conjunctive.

$Q$ is thus partitioned by $l$ into $Q_\vee$ and $Q_\wedge$. We use letters $r, r', \ldots$ to denote conjunctive states, $s, s', \ldots$ for disjunctive states and $q, q', \ldots$ for both. W.l.o.g. we require that $q_0, q_F \in Q_\vee$, that $\Sigma = \{a, b\}$, that each $q \neq q_F$ is the source of a transition, and that an ATM has clean alternation, i.e. it moves from disjunctive to conjunctive states and vice versa. We assign to each transition in $\delta$ a number $k \in \{1, \ldots, |\delta|\}$ and we will denote by $t_k$ the $k$-th transition.

A configuration of $A$ (also called an instantaneous description, or an i.d.) is a triple $\alpha = (q, i, w)$ where $q \in Q$ is the current state, $w \in \Sigma^*$ is a word describing the tape content, and $0 < i \le |w|$ is the position of the head on the tape (i.e. $A$ is currently seeing $w(i)$). We use letters $\beta, \ldots$ to denote disjunctive i.d.'s (that is, i.d.'s with a disjunctive control state) and $\gamma, \ldots$ for the conjunctive i.d.'s. An i.d. $(q, i, w)$ is final iff $q = q_F$.

An ATM moves like a usual non-deterministic TM: if $\alpha = (q, i, w)$, $w(i) = a$ and $(q, a, q', b, D) \in \delta$, then $A$ may move from $\alpha$ to $\alpha' = (q', i', w')$, written $\alpha \to \alpha'$, where $w'$ is $w$ updated by writing a $b$ in position $i$ and $i'$ is $i + 1$ if $D = R$ or $i - 1$ if $D = L$. (As usual, if $i'$ falls outside of $w'$, we pad $w'$ with an extra blank symbol and perhaps readjust $i'$.) We say $\alpha'$ is a successor of $\alpha$: there can only be a finite number of such successors.

The moves of an ATM starting from some i.d. $\alpha_0$ can be arranged into a tree: the root node is labeled with $\alpha_0$, and any node labeled by some $\alpha$ has one child for every $\alpha'$ s.t. $\alpha \to \alpha'$. The order of the branches is not relevant so that there is only one tree starting from a given $\alpha_0$. We call it the run of $A$ from $\alpha_0$.

3 There exist other varieties of non-flat systems. Quite often they rely on a direct synchronization mechanism and can be accounted for in our formalism. The few exceptions (e.g., the Message Sequence Charts of [MPS98] or the Communicating Hierarchical State Machines of [AKY99]) have only recently been considered from a complexity-theoretic point of view, and they are obvious candidates for continuations of our work.
The run of $A$ on some input word $x$ is its run from $\alpha(x) = (q_0, 1, x)$. Note that a run may be infinite, and that a node is a leaf if and only if it is labeled by a configuration without any successor.

For ATM's, accepting runs are defined by seeing the run as an AND-OR tree. Formally, for $n \in \mathbb{N}$, we say a run rooted at some disjunctive $\beta$ is accepting in $n$ steps iff $\beta$ is a final configuration or $n \ge 1$ and one of its children is accepting in $n - 1$ steps, while a run rooted at some conjunctive$^4$ $\gamma$ is accepting in $n$ steps iff $n \ge 1$ and all its children are accepting in $n - 1$ steps (and there is at least one child). We say $A$ accepts $x$ in $n$ steps iff the run from $\alpha(x)$ is accepting in $n$ steps. A word $x$ is accepted by $A$ iff there exists $n \ge 0$ s.t. $A$ accepts $x$ in $n$ steps.
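
The acceptance condition can be written as a direct recursion over the AND-OR tree. The sketch below makes several assumptions of its own: transitions are tuples `(q, a, q2, b, D)`, head positions are 1-based, the machine is taken to be linearly bounded so that a move leaving the tape is simply treated as disabled, and the toy machine (with `s3` chosen as a hypothetical final state) is not claimed to satisfy every w.l.o.g. requirement stated above.

```python
def successors(delta, conf):
    """All configurations reachable in one move from conf = (q, i, w)."""
    q, i, w = conf
    out = []
    for (p, a, p2, b, d) in delta:
        if p == q and w[i - 1] == a:
            j = i + 1 if d == "R" else i - 1
            if 1 <= j <= len(w):  # assumption: moves leaving the tape are disabled
                out.append((p2, j, w[:i - 1] + b + w[i:]))
    return out


def accepts_in(delta, kind, q_final, conf, n):
    """True iff the run rooted at conf is accepting in n steps."""
    q = conf[0]
    succ = successors(delta, conf)
    if kind[q] == "or":
        # disjunctive: final, or some child accepts in n - 1 steps
        if q == q_final:
            return True
        return n >= 1 and any(
            accepts_in(delta, kind, q_final, c, n - 1) for c in succ
        )
    # conjunctive: at least one child, and all children accept in n - 1 steps
    return n >= 1 and bool(succ) and all(
        accepts_in(delta, kind, q_final, c, n - 1) for c in succ
    )


# Hypothetical toy machine with six transitions
delta = [
    ("s1", "a", "r1", "b", "R"), ("s1", "b", "r2", "b", "R"),
    ("r1", "a", "s2", "a", "L"), ("r1", "a", "s3", "b", "R"),
    ("r2", "b", "s3", "a", "R"), ("r2", "a", "s4", "a", "L"),
]
kind = {"s1": "or", "s2": "or", "s3": "or", "s4": "or", "r1": "and", "r2": "and"}
```

On input `bbb` the only branch leads to the final state after two moves; on input `aaa` the conjunctive state `r1` has a child stuck in `s2`, so the word is rejected.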

We say $A$ is linearly-bounded on $x$ if any configuration $(q, i, w)$ in the run of $A$ on $x$ has $|w| \le |x|$ (that is, the machine never uses more tape than what is needed by the input). A classical result says that the problem LB-ATM-ACCEPT:

input: an ATM $A$ and a word $x \in \Sigma^*$ s.t. $A$ is linearly-bounded on $x$,
output: yes iff $A$ accepts $x$, no otherwise,
is EXPTIME-complete.

4 EXPTIME-Hard Relations 

Theorem 4.1. Any relation lying between the simulation preorder and bisimi- 
larity is EXPTIME-hard on non-flat systems. 

This is our main technical result and the rest of this section is devoted to the proof, a logspace reduction from LB-ATM-ACCEPT. The proof of EXPTIME-hardness of bisimilarity in [JM96] is also based on a reduction from LB-ATM-ACCEPT but, as mentioned in the introduction, the encoding is quite different.

The proofs of the next two lemmas assume familiarity in handling simulations 
and bisimulations. 



4.1 Modeling an ATM by a Non-flat System

Let $A, w_0$ be an ATM with a word of length $n$ such that $A = (Q, \Sigma, \delta, l, q_0, q_F)$ is linearly-bounded on $w_0$. We build a concurrent system $S_{A,w_0} = (B, C_1, \ldots, C_n)$ which models the run of $A$ over $w_0$. Each $C_i$ models the $i$-th tape cell: it can be in state $a$ or $b$, and its initial state is $w_0(i)$. The tape cell synchronizes with the head of the ATM, hence for each transition $t_k = (q, e, q', e', d) \in \delta$, $C_i$ has a transition $e \xrightarrow{t_k, i} e'$. See Fig. 2 for an example of a $C_i$ component for a set of transitions.

$B$ is the control part of $A$. Write $Q^- = \{q^- \mid q \in Q\}$ (resp. $Q^+ = \{q^+, \ldots\}$) for a set of copies of states from $Q$ tagged by a "$-$" (resp. by a "$+$"). The set of states of $B$ is $Q_B = Q^- \times \{1, \ldots, n\} \cup Q^+_\vee \times \{1, \ldots, n\} \cup Q^+_\wedge \times \{1, \ldots, n\} \times \{t_1, \ldots, t_{|\delta|}\}$.

4 Remember $q_F \notin Q_\wedge$.
$t_1 = (s_1, a, r_1, b, R)$
$t_2 = (s_1, b, r_2, b, R)$
$t_3 = (r_1, a, s_2, a, L)$
$t_4 = (r_1, a, s_3, b, R)$
$t_5 = (r_2, b, s_3, a, R)$
$t_6 = (r_2, a, s_4, a, L)$

Fig. 2. System $C_i$



A state $(q^-, i)$ of $B$ encodes a control state of $A$ and a position of $A$'s head over the tape. For each $t_k = (q, e, q', e', d) \in \delta$ and for any $i$, $B$ has a transition $(q^-, i) \xrightarrow{t_k, i} (q'^-, i + d)$ where $i + d$ denotes $i + 1$ (resp. $i - 1$) if $d = R$ and $i < n$ (resp. $d = L$ and $i > 1$).

These transitions are called "type 1" and they synchronize with the corresponding transitions from the $C_i$'s: a transition labeled "$t_k, i$" is enabled in $S_{A,w_0}$ iff the current control state is $q$, the position of the head is $i$, and $C_i$ contains the right value. Firing this transition modifies the value of $C_i$, the control state and the head position so that the behaviour of $A$ and its tape is faithfully emulated by the type 1 transitions.




Fig. 3. (Part of) system $B$. Kinds of transitions shown: type 1; types 2, 3; types 4, 5.

An example of such type 1 transitions is displayed in the left part of Fig. 3, assuming $\delta$ as in the previous example.

The $q^+$ states behave slightly differently. To begin with, disjunctive and conjunctive states are not dealt with in the same way. Assume $s \in Q_\vee$ is disjunctive. For each $t_k = (s, e, r, e', d)$ and $t_{k'} = (r, f, s', f', d')$ in $\delta$, for any $i$, $B$ has a (type 2) transition $(s^+, i) \xrightarrow{t_k, i} (r^+, i + d, t_{k'})$. These transitions correspond to "firing $t_k$ and picking $t_{k'}$ as next transition".

The transitions for conjunctive states are defined accordingly: if $t_k$ is some $(r, e, s, e', d)$, then $B$ has (type 3) transitions $(r^+, i, t_k) \xrightarrow{t_k, i} (s^+, i + d)$.

Here is the idea behind these transitions: assume an i.d. $(s, i, w)$ is not accepting. A strategy for establishing this has an opponent firing any transition $t_k$ from $s$, and the defender picking a $t_{k'}$ (one must exist) leading to a rejecting i.d., etc. In $S_{A,w_0}$, this strategy is implemented by having $s^-$ (the opponent) firing a type 1 transition $t_k$, forcing $s^+$ to fire the type 2 $t_k$ transition that selects $t_{k'}$ for the next move.

The (type 4) transitions move from $s^-$ states to the $Q^+$ part: formally $B$ has $(s^-, i) \xrightarrow{t_k, i} (r^+, i', t_{k'})$ for all $(r^+, i', t_{k'})$ s.t. $(s^+, i) \xrightarrow{t_k, i} (r^+, i', t_{k'})$. The purpose is to allow in $s^-$ everything allowed in $s^+$.

The (type 5) transitions allow firing any $t_{k'}$ from a $(r^+, i, t_k)$ where $B$ has already committed to $t_k$. Assume $t_k = (r, e, s, e', d)$, then for all $t_{k'} \neq t_k$ starting from $r$, i.e. $t_{k'}$ is some $(r, f, s', f', d')$, $B$ has one of the following transitions depending on the values of $e$ and $f$:

— $(r^+, i, t_k) \xrightarrow{t_{k'}, i} (s'^-, i + d')$ if $e = f$. For example, see transition "$t_4, 2$" from $(r_1^+, 2, t_3)$ or transition "$t_3, 2$" from $(r_1^+, 2, t_4)$ in Fig. 3.

— $(r^+, i, t_k) \xrightarrow{t_{k'}, i} (s'^+, i + d')$ if $e \neq f$. For example, see transition "$t_6, 2$" from $(r_2^+, 2, t_5)$ or transition "$t_5, 2$" from $(r_2^+, 2, t_6)$ in Fig. 3.

Intuitively if $t_k$ and $t_{k'}$ are both enabled (according to the value of $C_i$), then the transition leads to the corresponding $s^-$ state, otherwise it leads to the corresponding $s^+$ state.

Finally, $B$ has special (type 6) transitions in $(q_F^-, i)$ states: $(q_F^-, i) \xrightarrow{acc} (q_F^-, i)$. These are the only transitions without synchronization, they do not exist from the $q_F^+$ states and they distinguish between the $Q^-$ and the $Q^+$ parts of $B$.

With this we have completed the description of $B$. The size of $S_{A,w_0}$ is $O(n \times |Q| \times |\delta|^2)$ and $S_{A,w_0}$ can be built using only four counters, that is in space $\ln(n) + \ln(|Q|) + 2\ln(|\delta|)$.
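
To make the six kinds of transitions easier to follow, here is a sketch that enumerates the transitions of $B$ from a transition table. The encoding is ours and purely illustrative: states of $B$ are tuples `("-", q, i)`, `("+", s, i)` or `("+", r, i, k)`, labels are pairs `(k, i)` or the string `"acc"`, and the target state of a type 6 transition is assumed to be its source. The example reuses the six transitions listed for Fig. 2, with `s3` as a hypothetical final state.

```python
def build_B(delta, kind, q_final, n):
    """delta: dict k -> (q, e, q2, e2, d); kind: state -> 'or' or 'and'."""

    def shift(i, d):
        # i + d is only defined while the head stays on the n tape cells
        if d == "R" and i < n:
            return i + 1
        if d == "L" and i > 1:
            return i - 1
        return None

    trans = set()
    for i in range(1, n + 1):
        for k, (q, e, q2, e2, d) in delta.items():
            j = shift(i, d)
            if j is not None:
                trans.add((("-", q, i), (k, i), ("-", q2, j)))        # type 1
                if kind[q] == "or":
                    for k2, t2 in delta.items():
                        if t2[0] == q2:
                            target = ("+", q2, j, k2)
                            trans.add((("+", q, i), (k, i), target))  # type 2
                            trans.add((("-", q, i), (k, i), target))  # type 4
                else:
                    trans.add((("+", q, i, k), (k, i), ("+", q2, j)))  # type 3
            if kind[q] == "and":
                for k2, (p, f, s2, f2, d2) in delta.items():
                    j2 = shift(i, d2)
                    if p == q and k2 != k and j2 is not None:
                        sign = "-" if e == f else "+"
                        trans.add((("+", q, i, k), (k2, i), (sign, s2, j2)))  # type 5
        trans.add((("-", q_final, i), "acc", ("-", q_final, i)))      # type 6
    return trans


delta = {
    1: ("s1", "a", "r1", "b", "R"), 2: ("s1", "b", "r2", "b", "R"),
    3: ("r1", "a", "s2", "a", "L"), 4: ("r1", "a", "s3", "b", "R"),
    5: ("r2", "b", "s3", "a", "R"), 6: ("r2", "a", "s4", "a", "L"),
}
kind = {"s1": "or", "s2": "or", "s3": "or", "s4": "or", "r1": "and", "r2": "and"}
B = build_B(delta, kind, "s3", 3)
```

With these inputs the sketch produces, for instance, the type 5 transition labeled "$t_4, 2$" from $(r_1^+, 2, t_3)$ into the $Q^-$ part (both transitions read $a$) and the one labeled "$t_6, 2$" from $(r_2^+, 2, t_5)$ into the $Q^+$ part (the two transitions read different letters).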

4.2 Relating $A$ on $w_0$ and $S_{A,w_0}$

A configuration of $S_{A,w_0}$ has the form $(p, e_1, \ldots, e_n)$ where $p$ is a $B$ state and $e_i \in \{a, b\}$ is a $C_i$ state. We write such a configuration as $(p, w)$ where $w \in \Sigma^n$ is given by $w(i) = e_i$ for $i = 1, \ldots, n$.

$(p, w)$ is said to be disjunctive (resp. conjunctive) depending on whether $p$ contains a disjunctive or conjunctive state of $A$.

We now link i.d.'s of $A$ and configurations of $S_{A,w_0}$. Given an i.d. $\alpha = (q, i, w)$, $\alpha^-$ denotes the $S_{A,w_0}$ configuration $((q^-, i), w)$. Given a disjunctive i.d. $\beta = (s, i, w)$, $\beta^+$ represents $((s^+, i), w)$. Given a conjunctive i.d. $\gamma = (r, i, w)$ and a transition $t_k$ whose source node is $r$, $\gamma_k^+$ denotes $((r^+, i, t_k), w)$.
Lemma 4.2. If $A$ does not accept $w_0$, then $((q_0^-, 1), w_0) \mathrel{\underline{\leftrightarrow}} ((q_0^+, 1), w_0)$.

Proof. Remember that $\beta, \ldots$ (resp. $\gamma, \ldots$) denote disjunctive (resp. conjunctive) i.d.'s. Consider the following relation $R$ between the configurations of $S_{A,w_0}$:

$R = \{(\beta^-, \beta^+) \mid \beta \text{ is rejecting}\} \cup Id \cup \{(\gamma^-, \gamma_k^+) \mid \gamma \to_k \beta \text{ with } \beta, \gamma \text{ rejecting}\}$

We can show that $R$ is a bisimulation: since $A$ rejects $w_0$, then $(((q_0^-, 1), w_0), ((q_0^+, 1), w_0)) \in R$, and it remains to check that $R$ has the transfer property in both directions (see Appendix A for details).

Lemma 4.3. If $A$ accepts $w_0$, then $((q_0^-, 1), w_0) \not\sqsubseteq ((q_0^+, 1), w_0)$.

Proof. By induction on the number of steps for accepting. See Appendix B.

Corollary 4.4. For any relation $\mathcal{R}$ s.t. $\underline{\leftrightarrow} \subseteq \mathcal{R} \subseteq \sqsubseteq$, $A$ does not accept $w_0$ iff $((q_0^-, 1), w_0) \mathrel{\mathcal{R}} ((q_0^+, 1), w_0)$,

which concludes the proof of Theorem 4.1.

4.3 Upper Bounds 

Theorem 4.1 is in a sense optimal since the lower bounds it exhibits are optimal for the relations singled out in Fig. 1.

Theorem 4.5. Bisimulation, 2-nested simulation, ready simulation, and simu- 
lation on non-flat systems are EXPTIME-complete. 

Proof (sketch). It only remains to show membership in EXPTIME. In all four cases this can be done by a reduction to model checking of a simple branching-time mu-calculus formula. Such a reduction expresses a relation $\mathcal{R}$ via a mu-calculus formula $\varphi_{\mathcal{R}}$ in a way s.t. $C \mathrel{\mathcal{R}} C'$ iff $(C, \bar{C}') \models \varphi_{\mathcal{R}}$ where $\bar{C}'$ is a variant of $C'$ where the actions have been relabeled to avoid conflicts with $C$. For bisimulation this is done in [And93], and the same technique applies to the other equivalences. We then rely on EXPTIME-completeness of mu-calculus model-checking for non-flat systems [KVW98, Rab97b]. □

4.4 Extension to Timed Automata 

Timed Automata [AD94] can be seen as a special kind of non-flat systems. We denote by $\mathbb{N}$ the set of natural numbers and by $\mathbb{R}$ the set of non-negative real numbers. If $Cl = \{x, y, \ldots\}$ is a set of clocks, $\mathcal{C}(Cl)$ denotes clock constraints over $Cl$, that is the set of formulas built using boolean connectives over atomic formulas of the form $x \bowtie m$ or $x - y \bowtie m$ with $x, y \in Cl$, $m \in \mathbb{N}$ and $\bowtie \in \{=, <, >, \le, \ge\}$. A time assignment $v$ for $Cl$ is a function from $Cl$ to $\mathbb{R}$. We denote by $\mathbb{R}^{Cl}$ the set of time assignments for $Cl$. For $v \in \mathbb{R}^{Cl}$ and $d \in \mathbb{R}$, $v + d$ denotes the time assignment which maps each clock $x$ in $Cl$ to the value $v(x) + d$. For $Cl' \subseteq Cl$, $[Cl' \leftarrow 0]v$ denotes the assignment for $Cl$ which maps each clock in $Cl'$ to the value 0 and agrees with $v$ over $Cl \setminus Cl'$. Given a condition $g \in \mathcal{C}(Cl)$ and a time assignment $v \in \mathbb{R}^{Cl}$, we write $v \models g$ when $g$ holds for $v$. We define the timed automata (TA):

Definition 4.6. A timed automaton $TA$ over $\Sigma$ is a tuple $(N, \eta_0, Cl, E)$ where $N$ is a finite set of nodes, $\eta_0 \in N$ is the initial node, $Cl$ is a finite set of clocks, $E \subseteq N \times \mathcal{C}(Cl) \times \Sigma \times 2^{Cl} \times N$ corresponds to the set of edges: $e = (\eta, g, a, r, \eta') \in E$ represents an edge from the node $\eta$ to the node $\eta'$ with action $a$, $r$ denotes the set of clocks to be reset and $g$ is the enabling condition (the guard) over the clocks of $TA$. We use the notation $\eta \xrightarrow{g, a, r} \eta'$.

A configuration of $TA$ is a pair $(\eta, v)$ where $\eta$ is a node of $TA$ and $v$ a time assignment for $Cl$. Informally, the system starts at node $\eta_0$ with the assignment $v_0$ which maps all clocks to 0. The values of the clocks may increase synchronously with time. At any time, the automaton whose current node is $\eta$ can change node by following an edge $(\eta, g, a, r, \eta') \in E$ provided the current values of the clocks satisfy $g$. With this transition the clocks in $r$ get reset to 0. Let $\Theta$ denote the set of delay actions $\{\epsilon(d) \mid d \in \mathbb{R}\}$. Formally the semantics of $TA$ is defined as a labeled timed transition system:

Definition 4.7. A labeled timed transition system over $\Sigma$ is a tuple $\mathcal{S} = (S, s_0, \rightarrow)$, where $S$ is a set of states, $s_0$ is the initial state, $\rightarrow \subseteq S \times (\Sigma \cup \Theta) \times S$ is a transition relation. We require that for any $s \in S$ and $d \in \mathbb{R}$, there exists a unique state $s^d$ such that $s \xrightarrow{\epsilon(d)} s^d$ and that $(s^d)^e = s^{d+e}$.

The labeled timed transition system associated with $TA$ is $(S_{TA}, s_0, \rightarrow_{TA})$, where $S_{TA}$ is the set of configurations of $TA$, $s_0$ is the initial configuration $(\eta_0, v_0)$, and $\rightarrow_{TA}$ is the transition relation defined as follows:

$(\eta, v) \xrightarrow{a} (\eta', v')$ iff $\exists (\eta, g, a, r, \eta') \in E$ s.t. $v \models g$ and $v' = [r \leftarrow 0]v$
$(\eta, v) \xrightarrow{\epsilon(d)} (\eta', v')$ iff $\eta = \eta'$ and $v' = v + d$
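
The two rules of this semantics can be sketched in a few lines. The representation is an assumption made for illustration: a time assignment is a dictionary from clock names to floats, a guard is a Python predicate over such a dictionary rather than a syntactic clock constraint, and the single edge used in the example is hypothetical.

```python
def delay(v, d):
    """v + d: let d time units pass, all clocks advance together."""
    return {x: t + d for x, t in v.items()}


def reset(v, r):
    """[r <- 0]v: clocks in r are set to 0, the others are unchanged."""
    return {x: (0.0 if x in r else t) for x, t in v.items()}


def action_successors(edges, conf, a):
    """Configurations reachable from conf = (node, v) by an a-labeled edge."""
    node, v = conf
    return [
        (tgt, reset(v, r))
        for (src, guard, act, r, tgt) in edges
        if src == node and act == a and guard(v)
    ]


# Hypothetical automaton with one edge: n0 --(x >= 1 and x - y <= 2), a, {x}--> n1
edges = [("n0", lambda v: v["x"] >= 1 and v["x"] - v["y"] <= 2, "a", {"x"}, "n1")]
v0 = {"x": 0.0, "y": 0.0}
```

Delays compose additively, which is the requirement $(s^d)^e = s^{d+e}$ of Definition 4.7, and an action step is only possible once enough time has passed for the guard to hold.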



The standard notion of bisimulation (and simulation) can be naturally extended to timed systems [Cer93]: A strong timed (bi)simulation between $TA$ and $TA'$ is a (bi)simulation between the associated labeled timed transition systems.



Theorem 4.8. Any relation lying between the simulation preorder and bisimi- 
larity is EXPTIME-hard on timed automata. 



Proof (sketch). Let $A, w_0$ be an ATM with a word of length $n$. We transform the automaton $B$ defined in Section 4.1 into a timed automaton $TB$ in such a way that the clocks $Cl$ of $TB$ encode the tape content. This encoding is used in [AL99]. The transitions of $TB$ use guards over the clocks to ensure a correct behavior, and reset operations are used to modify the tape content according to the performed transition. Therefore we obtain a single timed automaton instead of a parallel composition of finite automata.
$TB$ uses $2n + 1$ clocks: $x_1, y_1, \ldots, x_n, y_n$ and $t$. The clock $t$ is used to ensure a delay of length 1 between two transitions of $TB$. The clocks $x_i$ and $y_i$ encode the value of the $i$-th tape cell by the following convention: $C_i = a$ (resp. $C_i = b$) iff $x_i = y_i$ (resp. $x_i < y_i$). Let $t_l$ be the ATM transition $(q, e, q', e', d)$: the transition $(q, k) \xrightarrow{t_l, k} (q', k + d)$ we used in $B$ is replaced by a transition $(q, k) \xrightarrow{g,\ (t_l, k),\ r} (q', k + d)$ where $g$ is $x_i = y_i$ (resp. $x_i < y_i$) if $e = a$ (resp. $e = b$), the reset set $r$ is $\{t, x_i\}$ (resp. $\{t, x_i, y_i\}$) if $e' = b$ (resp. $e' = a$). The initialization of the tape with the input word $w_0$ can be encoded by adding the transitions $init^- \xrightarrow{t = 1,\ s_0,\ r_{w_0}} (q_0^-, 1)$ and $init^+ \xrightarrow{t = 1,\ s_0,\ r_{w_0}} (q_0^+, 1)$ where $r_{w_0} = \{x_i \mid w_0(i) = b\}$. The $acc$ transitions of $B$ are kept in $TB$.

Lemmas 4.2 and 4.3 still hold for the initial configurations $(init^-, v_0)$ and $(init^+, v_0)$ where $v_0$ maps any clock in $Cl$ to 0.



Remark 4.9. Note that bisimulation and simulation for timed automata are EXPTIME-complete since the model-checking problem for the timed $\mu$-calculus (which allows to express bisimilarity and similarity) is EXPTIME-complete [AL99].



5 EXPSPACE-Hard Relations 

Theorem 5.1. Any relation lying between trace inclusion and the intersection of ready trace equivalence and possible-futures equivalence is EXPSPACE-hard on non-flat systems.

Proof (sketch). We adapt the proof, from [JM96], that trace inclusion is EXPSPACE-hard on non-flat systems.

Their proof is a reduction from the problem of deciding whether the language defined by a regular expression with interleaving is $\Sigma^*$, which is known to be EXPSPACE-complete [MS94]. Given any regular expression $e$ built from $\{\cup, \cdot, {}^*, \|\}$ with $|e| = n$, Jategaonkar and Meyer build a non-flat system $Net(e)$ over the alphabet $\Sigma \cup \{1, \surd\}$, s.t. $Tr(Net(e))$ is (the prefix-closure of) $\{1^{4n} a_1 1^{4n} \cdots a_k 1^{4n} \surd \mid a_1 \cdots a_k \in L(e)\}$.

Let $=_{RT.PF}$ be the equivalence defined as the intersection of $=_{RT}$ and $=_{PF}$. We can modify the previous model in a simple way to obtain $Net(e, n)$ with $n \ge |e|$ so that $L(e) = \Sigma^*$ iff $Net(e, n) =_{RT.PF} Net(\Sigma^*, n)$ iff $Net(\Sigma^*, n) \subseteq_{Tr} Net(e, n)$. This will entail the result.

The main idea is to add a state $end$ from which the enabled transitions are labeled by $\Sigma \cup \{1\}$ and lead to $end$. From any state $q$, we add transitions $q \to end$. In this way, $RT(Net(e, n))$ is (the prefix-closure of)

$\{(\underbrace{\{1\}, 1, \ldots, \{1\}, 1}_{4n}, \Sigma, a_1, \{1\}, 1, \ldots, \Sigma, a_k, \{1\}, 1, \ldots, \{\surd\}, \surd, \emptyset) \mid a_1 \ldots a_k \in L(e)\}$

$\cup\ \{(\underbrace{\{1\}, 1, \ldots, \{1\}, 1}_{4n}, \Sigma, b_1, \{1\}, 1, \ldots, \Sigma, b_k, \{1\}) \mid b_1, \ldots, b_k \in \Sigma\}$

and $PF(Net(e, n))$ is the set of pairs $(w, S)$ s.t. $w = \ldots \in Tr(Net(e, n))$ and $S$ is the prefix-closure of $S_1 \cup S_2$ or $S_2$ with:

$S_1 = \{w' \mid w.w'\surd \in Tr(Net(e, n))\}$

$S_2 = \{w' = 1^{4n-k}\, b_1 1^{4n} \cdots b_l 1^{4n} \mid b_1 \ldots b_l \in \Sigma^*\}$

Note that for $(w, S) \in PF(Net(\Sigma^*, n))$, $S$ is the prefix-closure of:

$\{w' = \ldots \mid b_1 \ldots b_l \in \Sigma^*\}$

Clearly, $Net(\Sigma^*, n) \subseteq_{Tr} Net(e, n)$ iff $Net(e, n) =_{RT} Net(\Sigma^*, n)$ iff $Net(e, n) =_{PF} Net(\Sigma^*, n)$ iff $L(e) = \Sigma^*$. This gives the result. □



5.1 Upper Bounds 

Theorem 5.1 is in a sense optimal since the lower bounds it exhibits are optimal for the relations singled out in Fig. 1.

Theorem 5.2. Possible-futures equivalence, ready trace equivalence, failure trace equivalence, readiness equivalence, failures equivalence, completed trace equivalence and trace equivalence on non-flat systems are EXPSPACE-complete.

Proof (sketch). We only need to prove membership in EXPSPACE. In all cases, 
this can be done by the naive algorithm, noting that the problems are in 
PSPACE for flat systems (by simple reductions to language equivalence of non- 
deterministic automata) . 



6 Conclusion 



We have shown that for non-flat systems, any relation between the simulation 
preorder and bisimilarity is EXPTIME-hard, and that any relation between trace 
inclusion and ready trace equivalence is EXPSPACE-hard. 

This is a partial answer to the questions raised by Rabinovich [Rab97a].$^5$ Indeed, these results cover a large array of relations, and they give lower bounds matching the (obvious) upper bounds in the 11 relations van Glabbeek singles out as most prominent in his branching time - linear time spectrum.

For the EXPTIME-hard relations, our construction also applies to timed 
automata, where the lower bounds were not known. 

This theoretical study has practical implications. It strongly suggests that there is no way to escape state explosion when checking non-flat systems for some behavioural relation, at least not by some smart choice of which behavioural equivalence is chosen.$^6$ Attempts at general solutions should rather aim at finding a smart limitation of how non-flat systems may be described. In such a quest, one should aim at forbidding the construction of our $S_{A,w_0}$ system (or any reasonably succinct equivalent encoding).

5 Additionally, our hardness results do not need his hide operator to further relabel the products of systems.

6 Since our results are not a complete answer, we cannot rule out the dim possibility that some PSPACE-easy relation exists in the branching time-linear time spectrum.

A related idea is to focus on the complexity of deciding whether $S \sim C$ where $C$ is a fixed system and where $S$ is then the only input. For this measure, called implementation complexity, the results are no longer uniform. For example, for simulation we have that deciding whether $S \sqsubseteq C$ is still EXPTIME-complete [HKV97] while for bisimulation, we have the following:

Proposition 6.1. When $C$ is fixed, deciding whether $S \mathrel{\underline{\leftrightarrow}} C$ is PSPACE-complete.

Proof (Sketch). PSPACE membership combines the ability to build a CTL formula $\varphi_C$ such that $S \mathrel{\underline{\leftrightarrow}} C$ iff $S \models \varphi_C$ [BCG88] and the fact that CTL model checking of non-flat systems is PSPACE-complete [KVW98]. PSPACE-hardness is by reduction of the reachability problem in $S$.
A Proof of Lemma 4.2

We have to show that $R$ has the transfer property in both directions:

1. Consider a pair $(\beta^-, \beta^+) \in R$, with $\beta = (s, i, w)$ a rejecting (disjunctive) i.d. $\beta$ is not a final configuration, no $acc$ transition is enabled from $\beta^-$, and we just have to check the transfer property for transitions labeled by $t_k, i$.

— Assume a type 1 move $\beta^- \xrightarrow{t_k, i} ((r^-, i'), w')$. In $A$, this corresponds to $(s, i, w) \to_k (r, i', w')$. Let $\gamma$ be $(r, i', w')$. Since $\beta$ is disjunctive and rejecting, $\gamma$ is rejecting. Thus there exists a move $\gamma \to_{k'} \beta'$ s.t. $\beta'$ is rejecting. Moreover the (type 2) transition $\beta^+ \xrightarrow{t_k, i} \gamma_{k'}^+$ is allowed in $S_{A,w_0}$. Therefore, for any transition $\beta^- \xrightarrow{t_k, i} \gamma^-$, there exists $\beta^+ \xrightarrow{t_k, i} \gamma_{k'}^+$ s.t. $(\gamma^-, \gamma_{k'}^+) \in R$.

— The other possible moves for this pair are type 4 transitions $\beta^- \to \gamma_{k'}^+$ and type 2 transitions $\beta^+ \to \gamma_{k'}^+$: they can be imitated by the other side, relying on $(\gamma_{k'}^+, \gamma_{k'}^+) \in Id \subseteq R$.

2. Consider a pair $(\gamma^-, \gamma_k^+) \in R$ with $\gamma = (r, i, w)$. $\gamma$ is a rejecting i.d. and there exists a move $\gamma \to_k \beta$ leading to a rejecting $\beta = (s, i', w')$. We check the transfer property:

— Assume a type 1 move $\gamma^- \xrightarrow{t_{k'}, i} \beta'^-$. Then $\gamma \to_{k'} \beta'$ and either $k = k'$ or $k \neq k'$. If $k = k'$, then $\gamma_k^+ \xrightarrow{t_k, i} \beta^+$ and since $\beta$ is not accepting, $(\beta^-, \beta^+) \in R$. When $k \neq k'$, both $t_k$ and $t_{k'}$ are enabled from $\gamma$, so that $t_k$ and $t_{k'}$ require the same letter on the tape cell, and there exists a type 5 move $\gamma_k^+ \xrightarrow{t_{k'}, i} \beta'^-$. We use the fact that $(\beta'^-, \beta'^-) \in Id \subseteq R$.

— Assume a type 3 move $\gamma_k^+ \xrightarrow{t_k, i} \beta^+$ with $\beta^+ = ((s^+, i'), w')$, it can be simulated by $\gamma^- \xrightarrow{t_k, i} \beta^-$ because $\beta$ is not accepting, so that $(\beta^-, \beta^+) \in R$.

— Other moves from $\gamma_k^+$ reach a $\beta'^-$ (because $t_k$ is enabled from $\gamma$) and can be easily imitated.

3. Finally, the pairs from $Id$ obviously enjoy the transfer property.

B Proof of Lemma 4.3

We show by induction on $l$ that

1. If $\beta$ is accepting in $l$ steps, then $\beta^- \not\sqsubseteq \beta^+$.

2. If $\gamma = (r, i, w)$ is accepting in $l$ steps, then $\gamma^- \not\sqsubseteq \gamma_k^+$ for any $k$ s.t. the source node of $t_k$ is $r$.

— $l = 0$: If $\beta$ accepts in 0 steps, then it is final and $\beta^- \xrightarrow{acc} \beta^-$ cannot be matched from $\beta^+$. A conjunctive configuration $\gamma$ cannot be accepting in 0 steps.

— Assume the property holds for any $l' \le l$. We have two cases:

• A disjunctive $\beta$ accepts in $l + 1$ steps. Then there exists $t_k$ s.t. $\beta \to_k \gamma$ where $\gamma = (r, i, w)$ is accepting in $l$ steps. In $S_{A,w_0}$, the transition $\beta^- \xrightarrow{t_k, i} \gamma^-$ has to be matched by a transition labeled with "$t_k, i$" which leads to a configuration $\gamma_{k'}^+$ and, by i.h., $\gamma^- \not\sqsubseteq \gamma_{k'}^+$.

• A conjunctive $\gamma = (r, i, w)$ accepts in $l + 1$ steps. We must show that $\gamma^- \not\sqsubseteq \gamma_k^+$ for any $k$ s.t. $t_k$ starts from $r$. There are two cases:

* $t_k$ is enabled from $\gamma$: since $\gamma$ accepts, any move from $\gamma$ leads to an i.d. accepting in $l$ steps. In $S_{A,w_0}$, the transition $\gamma^- \xrightarrow{t_k, i} \beta^-$ can only be matched from $\gamma_k^+$ by $\gamma_k^+ \xrightarrow{t_k, i} \beta^+$ and, by i.h., $\beta^- \not\sqsubseteq \beta^+$.

* $t_k$ is not enabled from $\gamma$: any transition $t_{k'}$ enabled from $\gamma$ leads to an i.d. accepting in $l$ steps and one such transition exists. In $S_{A,w_0}$, the move $\gamma^- \xrightarrow{t_{k'}, i} \beta^-$ can only be matched from $\gamma_k^+$ by $\gamma_k^+ \xrightarrow{t_{k'}, i} \beta^+$ and, by i.h., $\beta^- \not\sqsubseteq \beta^+$.

In both cases, we found a transition from $\gamma^-$ which cannot be simulated from $\gamma_k^+$ and then $\gamma^- \not\sqsubseteq \gamma_k^+$.

Now, since we assume that $(q_0, 1, w_0)$ is accepting, the proof is complete.
Abstract. A proof system for timed automata is presented, based on a CCS-style language for describing timed automata. It consists of the standard monoid laws for bisimulation and a set of inference rules. The judgments of the proof system are conditional equations of the form $\phi \rhd t = u$ where $\phi$ is a clock constraint and $t$, $u$ are terms denoting timed automata. It is proved that the proof system is complete for timed bisimulation over the recursion-free subset of the language. The completeness proof relies on the notion of symbolic timed bisimulation. The axiomatisation is also extended to handle an important variation of timed automata where each node is associated with an invariant constraint.



1 Introduction 

Timed automata [AD94] have been recognised as a fundamental model for real time systems. By now the theory of timed automata has been well developed, but there is still one aspect missing: axiomatisation.

Timed automata extend traditional finite automata with a finite set of real- 
valued clock variables (or clocks for short) by annotating each transition with, 
in addition to an action label, a clock constraint (enabling condition) and a 
subset of clocks (reset set). Intuitively a timed automaton may stay at a node 
with clocks increasing uniformly (to model time passage), or choose a transition 
whose clock constraint is satisfied, make the move, reset the subset of clocks 
associated with the transition to zero, and arrive at the target node of the 
transition (to model control state switch). The explicit presence of clock variables and resetting, features that are mainly associated with the so-called "imperative languages", distinguishes timed automata from process calculi such as CCS, CSP and their timed extensions which are "applicative" in nature and therefore more amenable to axiomatisation.

* Supported by research grants from National Science Foundation of China and Chinese Academy of Sciences

J. Tiuryn (Ed.): FOSSACS 2000, LNCS 1784, pp. 208- 12^ 2000.
(c) Springer-Verlag Berlin Heidelberg 2000

The aim of this paper is to propose a proof system for timed automata. We adapt the symbolic bisimulation technique originally developed for value-passing processes [HL95, HL96] to the timed setting. We first present a simple CCS-style language in which each term represents a timed automaton. The language has a conditional construct $\phi \to t$ (read "if $\phi$ then $t$") where $\phi$ is a clock constraint. Action prefixing is of the form $a(x).t$, meaning to perform action $a$ and reset the clocks in $x$ to zero, then behave like $t$. An inference system is then formulated with judgments being conditional equations of the form



$\phi \rhd t = u$



Intuitively it means that "$t$ and $u$ are timed bisimilar for each clock valuation satisfying $\phi$". A typical inference rule takes the form:



$$\text{GUARD} \quad \frac{\phi \wedge \psi \rhd t = u \qquad \phi \wedge \neg\psi \rhd \mathbf{0} = u}{\phi \rhd (\psi \to t) = u}$$



It performs a case analysis on the constraint $\psi$: $\psi \to t$ behaves like $t$ when $\psi$ is true, and like the inactive process $\mathbf{0}$ otherwise. Note that the guarding constraint $\psi$ of $\psi \to t$ in the conclusion is part of the object language describing timed automata, while in the premise it is shifted to the condition part of the judgment in our meta language for reasoning about timed automata.

The crucial rule, as might be expected, is the one for action prefixing: 



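$$\text{ACTION} \quad \frac{\phi{\downarrow_{xy}}{\Uparrow} \rhd t = u}{\phi \rhd a(x).t = a(y).u} \qquad y \cap \mathcal{C}(t) = x \cap \mathcal{C}(u) = \emptyset$$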



Here $\downarrow_{xy}$ and $\Uparrow$ are postfixing operations on clock constraints. $\phi{\downarrow_{xy}}{\Uparrow}$ is a clock constraint obtained from $\phi$ by first setting the clocks in $xy$ to zero (operator $\downarrow_{xy}$), then removing upper bounds on all clocks of $\phi$ (operator $\Uparrow$). Readers familiar with Hoare Logic may notice some similarity between this rule and the rule dealing with assignment there:



$\{P[e/x]\}\ x := e\ \{P\}$



But here the operator $\downarrow_{xy}$ is slightly more complicated than substitution with zero, because clocks are required to increase uniformly. We also need $\Uparrow$ to allow time to pass indefinitely.

Traditionally, axiomatisations for so-called "pure" process algebras are based on equational reasoning, i.e. "replacing equal for equal". Since timed automata involve clock constraints and clock resetting, it is not surprising that pure equational reasoning alone is no longer adequate. The inference system proposed in this paper can be viewed as extending pure equational reasoning by formulating suitable rules for the specific constructs present in timed automata. It turns out that with this extension the standard monoid laws for bisimulation are sufficient for timed bisimulation, i.e. the proof system consisting of the set of inference rules and the four monoid laws is sound and complete for timed bisimulation. The proof of the completeness result relies on developing a theory of timed symbolic bisimulation which is a binary relation indexed by clock constraints. It captures the standard definition of timed bisimulation in the sense that $t$ and $u$ are symbolically bisimilar over indexing constraint $\phi$ if and only if they are timed bisimilar for any time valuation satisfying $\phi$.

Fig. 1. A Timed Automaton.

In the remainder of this section we briefly discuss related work. The language for timed automata is presented in the next section, with a symbolic operational semantics which associates each term in the language to a timed automaton. Section 3 develops a theory of symbolic bisimulation for timed automata. The proof system is presented in Section 4 together with its completeness proof. Section 5 discusses how to extend the language to include invariants. The paper is concluded with Section 6 where further research directions are also outlined.

Related work. The first process algebra for timed automata is proposed in [WPD94] as the very first input language for the UPPAAL tool. The only previous attempt at axiomatising timed automata we are aware of is [DAB96], which develops a large set of sound axioms for timed bisimulation. However, no completeness result is reported.

On the other hand, most timed extensions of process algebras came with axiomatisations of various equivalence relations including bisimulation. Of particular interest is [Bor96] which also adapts the symbolic bisimulation technique of [HL95, HL96] to a timed process language and proposes a symbolic style proof system. As noted by the author, the language considered in that paper is quite different from timed automata as it does not involve clock variables.

2 A Language for Timed Automata 

The theory of timed automata was introduced in [AD94] and has since been established as a standard model for real time systems. We first give a brief review for readers unfamiliar with timed automata and then present an algebraic language in which each term denotes a timed automaton.



2.1 Timed Automata 

A timed automaton is a standard finite-state automaton extended with a finite collection of real-valued clocks. In a timed automaton each transition is labelled with a guard (a constraint over clocks), a synchronisation action, and a reset set (a subset of clocks to be reset). Intuitively, a timed automaton starts an execution with all clocks initialised to zero. Clocks increase at the same rate while the automaton stays within a node. A transition can be taken if the clocks fulfill the guard. By taking the transition, all clocks in the reset set are set to zero, while the others are unchanged. Thus transitions occur instantaneously. Semantically, a state of an automaton is a pair of a control node and a clock valuation, i.e. the current setting of the clocks. Transitions in the semantic interpretation are either labelled with a synchronisation action (if it is an instantaneous switch from the current node to another) or a positive real number, i.e. a time delay (if the automaton stays within a node letting time pass).

Consider the timed automaton of Figure 1. It has two control nodes $l_0$ and
$l_1$ and two real-valued clocks $x$ and $y$. A state of the automaton is of the form
$(l, (s, t))$, where $l$ is a control node, and $s$ and $t$ are non-negative reals giving the
values of the two clocks $x$ and $y$. Assuming that the automaton starts to operate
in the state $(l_0, (0, 0))$, it may stay in node $l_0$ for any amount of time, while the
values of the clocks increase uniformly, at the same rate. Thus from the initial
state, all states of the form $(l_0, (t, t))$ with $t > 0$ are reachable. However, the
edge from $l_0$ to $l_1$ is enabled only at the states $(l_0, (t, t))$ where $t > 1$.
Additionally, edges are labelled with synchronisation actions and simple assignments
resetting clocks. For instance, when following the edge from $l_0$ to $l_1$ the action $a$
is performed to synchronise with the environment and the clock $y$ is reset to 0,
leading to states of the form $(l_1, (t, 0))$, where $t > 1$.

For the formal definition, we assume a finite alphabet $A$ of synchronisation
actions and a finite set $C$ of real-valued variables for clocks. We use $a, b$
etc. to range over $A$ and $x, y$ etc. to range over $C$. Subsets of $C$ will be denoted
by $\mathbf{x}, \mathbf{y}$ with elements $x_i, x_j, \ldots, y_i, y_j, \ldots$. We use $\mathcal{B}(C)$, ranged over by $\phi, \psi$
etc., to denote the set of conjunctive formulas of atomic constraints of the form
$x_i \bowtie m$ or $x_i - x_j \bowtie n$, where $x_i, x_j \in C$, $\bowtie\ \in \{\leq, <, \geq, >\}$, and $m, n$ are natural
numbers. The elements of $\mathcal{B}(C)$ are called clock constraints.

Definition 2.1. A timed automaton over actions $A$ and clocks $C$ is a tuple
$(N, l_0, E)$ where

- $N$ is a finite set of nodes,

- $l_0 \in N$ is the initial node,

- $E \subseteq N \times \mathcal{B}(C) \times A \times 2^{C} \times N$ is the set of edges.

When $(l, g, a, r, l') \in E$, we write $l \xrightarrow{g,a,r} l'$.
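The definition above can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the names `Edge`, `satisfies`, `delay` and `take` are hypothetical, guards are restricted to atomic constraints of the form "clock op bound" (difference constraints are left out), and the example edge with guard `x > 1` is merely modelled on the informal description of Figure 1, not copied from it.

```python
from dataclasses import dataclass
from fractions import Fraction

# Comparison operators allowed in atomic clock constraints (sketch only).
OPS = {
    "<": lambda v, m: v < m,
    "<=": lambda v, m: v <= m,
    ">": lambda v, m: v > m,
    ">=": lambda v, m: v >= m,
}


@dataclass(frozen=True)
class Edge:
    source: str
    guard: tuple       # conjunction of (clock, op, bound) triples
    action: str
    reset: frozenset   # clocks set to zero when the edge is taken
    target: str


def satisfies(valuation, guard):
    """True if the valuation satisfies every atomic constraint of the guard."""
    return all(OPS[op](valuation[clock], bound) for clock, op, bound in guard)


def delay(state, d):
    """Let d time units pass: all clocks advance at the same rate."""
    node, valuation = state
    return node, {c: v + d for c, v in valuation.items()}


def take(state, edge):
    """Follow an edge if it is enabled; return None otherwise."""
    node, valuation = state
    if node != edge.source or not satisfies(valuation, edge.guard):
        return None
    new_valuation = {
        c: (Fraction(0) if c in edge.reset else v) for c, v in valuation.items()
    }
    return edge.target, new_valuation


# Hypothetical edge: from l0 to l1, guard x > 1, action a, resetting y.
edge = Edge("l0", (("x", ">", 1),), "a", frozenset({"y"}), "l1")
start = ("l0", {"x": Fraction(0), "y": Fraction(0)})

early = take(start, edge)                        # guard fails at time 0
late = take(delay(start, Fraction(3, 2)), edge)  # guard holds after 3/2 time units
```

Exact rationals (`Fraction`) are used instead of floats so that comparisons against integer bounds are not affected by rounding.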

We shall present the operational semantics for timed automata in terms of a 
process algebraic language in which each term denotes an automaton. 

Sometimes, to describe progress properties, nodes of timed automata are
associated with invariants that control the amount of time an automaton can stay
at a node. Such an extension will be discussed in Section 5.






Huimin Lin and Wang Yi 



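\[
\begin{array}{lll}
\text{DELAY} & t\rho \xrightarrow{d} t(\rho + d) & \\[2ex]
\text{ACTION} & (a(\mathbf{x}).t)\rho \xrightarrow{a} t\rho\{\mathbf{x} := 0\} & \\[2ex]
\text{GUARD} & \dfrac{t\rho \xrightarrow{a} t'\rho'}{(\phi \to t)\rho \xrightarrow{a} t'\rho'} & \rho \models \phi \\[3ex]
\text{CHOICE} & \dfrac{t\rho \xrightarrow{a} t'\rho'}{(t + u)\rho \xrightarrow{a} t'\rho'} & \\[3ex]
\text{REC} & \dfrac{(t[\mathbf{fix}Xt/X])\rho \xrightarrow{a} t'\rho'}{(\mathbf{fix}Xt)\rho \xrightarrow{a} t'\rho'} &
\end{array}
\]

Fig. 2. Standard Transitional Semantics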



2.2 The Language 

We presuppose a set of process variables, ranged over by $X, Y, Z, \ldots$. The
language for timed automata over $C$ can be given by the following BNF grammar:

\[ t ::= \mathbf{0} \mid \phi \to t \mid a(\mathbf{x}).t \mid t + t \mid X \mid \mathbf{fix}Xt \]

$\mathbf{0}$ is the inactive process which can do nothing, except for allowing time to pass.
$\phi \to t$, read “if $\phi$ then $t$”, is the usual (one-armed) conditional construct. $a(\mathbf{x}).t$
is action prefixing. $+$ is nondeterministic choice.

A recursion $\mathbf{fix}Xt$ binds $X$ in $t$. This is the only binding operator in this
language. It induces the notions of bound and free process variables as usual.
Terms not containing free variables are closed. A recursion $\mathbf{fix}Xt$ is guarded if
every occurrence of $X$ in $t$ is within the scope of an action prefixing.

The set of clock variables used in a term $t$ is denoted $C(t)$.

A clock valuation is a function from $C$ to $\mathbb{R}^{\geq 0}$, and we use $\rho$ to range over
clock valuations. The notations $\rho\{\mathbf{x} := 0\}$ and $\rho + d$ are defined thus:

\[
\rho\{\mathbf{x} := 0\}(y) = \begin{cases} 0 & \text{if } y \in \mathbf{x} \\ \rho(y) & \text{otherwise} \end{cases}
\qquad
(\rho + d)(x) = \rho(x) + d \ \text{ for all } x
\]

Given a clock valuation $\rho : C \to \mathbb{R}^{\geq 0}$, a term can be interpreted according
to the rules in Figure 2, where the symmetric rule for $+$ has been omitted. The
transitional semantics uses two types of transition relations: action transition
$\xrightarrow{a}$ and delay transition $\xrightarrow{d}$. We call $t\rho$ a process, where $t$ is a term and $\rho$ a
valuation; we use $p, q, \ldots$ to range over the set of processes. We also write $\mu$ for
either an action or a delay (a real number).
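As an illustration of how a term is interpreted at a given valuation, the following sketch enumerates the action transitions of a recursion-free term in the spirit of the ACTION, GUARD and CHOICE rules. The tuple encoding of terms, the representation of guards as Python predicates, and the function names are assumptions of this sketch, not part of the paper; recursion is deliberately left out.

```python
from fractions import Fraction

# Terms as tagged tuples (an encoding assumed for this sketch):
#   ("nil",)                    the inactive process 0
#   ("guard", phi, t)           phi -> t, with phi a predicate on valuations
#   ("prefix", a, clocks, t)    a(x).t, with clocks the reset set x
#   ("choice", t, u)            t + u
NIL = ("nil",)


def reset(rho, clocks):
    """rho{x := 0}: clocks in the reset set become 0, the others are kept."""
    return {c: (Fraction(0) if c in clocks else v) for c, v in rho.items()}


def shift(rho, d):
    """rho + d: every clock advances by d."""
    return {c: v + d for c, v in rho.items()}


def action_steps(term, rho):
    """All (action, term', rho') reachable by one action transition."""
    tag = term[0]
    if tag == "prefix":
        _, a, clocks, body = term
        return [(a, body, reset(rho, clocks))]
    if tag == "guard":
        _, phi, body = term
        return action_steps(body, rho) if phi(rho) else []
    if tag == "choice":
        _, left, right = term
        return action_steps(left, rho) + action_steps(right, rho)
    return []  # the inactive process has no action transitions


def delay_step(term, rho, d):
    """Delay transition: the term is unchanged and time advances by d."""
    return term, shift(rho, d)


# (x > 1 -> a(y).0) + b().0
term = ("choice",
        ("guard", lambda rho: rho["x"] > 1, ("prefix", "a", {"y"}, NIL)),
        ("prefix", "b", set(), NIL))
rho0 = {"x": Fraction(0), "y": Fraction(0)}

now = action_steps(term, rho0)
_, rho1 = delay_step(term, rho0, Fraction(2))
later = action_steps(term, rho1)
```

At the initial valuation only `b` is possible; after a delay of 2 the guard `x > 1` holds, so `a` becomes available as well and resets `y`.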

Definition 2.2. A symmetric relation $R$ over processes is a timed bisimulation
if $(p, q) \in R$ implies

whenever $p \xrightarrow{\mu} p'$ then $q \xrightarrow{\mu} q'$ for some $q'$ with $(p', q') \in R$.

We write $p \sim q$ if $(p, q) \in R$ for some timed bisimulation $R$.
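\[
\begin{array}{ll}
\text{Action} & a(\mathbf{x}).t \xrightarrow{tt,\,a,\,\mathbf{x}} t \\[2ex]
\text{Choice} & \dfrac{t \xrightarrow{b,a,r} t'}{t + u \xrightarrow{b,a,r} t'} \\[3ex]
\text{Guard} & \dfrac{t \xrightarrow{\psi,a,r} t'}{\phi \to t \xrightarrow{\phi \wedge \psi,\,a,\,r} t'} \\[3ex]
\text{Rec} & \dfrac{t[\mathbf{fix}Xt/X] \xrightarrow{b,a,r} t'}{\mathbf{fix}Xt \xrightarrow{b,a,r} t'}
\end{array}
\]

Fig. 3. Symbolic Transitional Semantics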



The symbolic transitional semantics of this language is given in Figure 3.
Again the symmetric rule for $+$ has been omitted. According to the symbolic
semantics, each guarded closed term of the language gives rise to a timed
automaton. On the other hand, it is not difficult to see that every timed automaton
can be generated from a guarded closed term in the language. In the sequel we
will use the phrases “timed automata” and “terms” interchangeably.

The two versions of transitional semantics can be related as follows: 

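Lemma 2.3. 1. If $t \xrightarrow{\phi,a,\mathbf{x}} t'$ then $t\rho \xrightarrow{a} t'\rho\{\mathbf{x} := 0\}$ for any $\rho \models \phi$.

2. If $t\rho \xrightarrow{a} t'\rho'$ then there exist $\phi$, $\mathbf{x}$ such that $\rho \models \phi$, $\rho' = \rho\{\mathbf{x} := 0\}$ and
$t \xrightarrow{\phi,a,\mathbf{x}} t'$.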
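The relationship stated in the lemma can be illustrated for recursion-free terms. In the sketch below, symbolic transitions are computed once from the term, and concrete action transitions are obtained by firing those symbolic transitions whose guard is satisfied by the valuation (the first direction of the lemma). The encoding is an assumption of this sketch: guards are tuples of atomic constraints `(clock, op, bound)` read as a conjunction, the empty tuple stands for `tt`, and all function names are hypothetical.

```python
from fractions import Fraction

OPS = {
    "<": lambda v, m: v < m,
    "<=": lambda v, m: v <= m,
    ">": lambda v, m: v > m,
    ">=": lambda v, m: v >= m,
}
NIL = ("nil",)


def holds(rho, phi):
    """rho |= phi for a conjunction of atomic constraints."""
    return all(OPS[op](rho[clock], bound) for clock, op, bound in phi)


def symbolic_steps(term):
    """All symbolic transitions (guard, action, resets, target) of a term."""
    tag = term[0]
    if tag == "prefix":
        _, a, clocks, body = term
        return [((), a, frozenset(clocks), body)]  # guard tt
    if tag == "guard":
        _, phi, body = term
        # The guard of the conditional is conjoined with the inner guard.
        return [(phi + psi, a, r, t) for psi, a, r, t in symbolic_steps(body)]
    if tag == "choice":
        return symbolic_steps(term[1]) + symbolic_steps(term[2])
    return []


def concrete_steps(term, rho):
    """Fire every symbolic transition whose guard holds at rho."""
    return [
        (a, t, {c: (Fraction(0) if c in r else v) for c, v in rho.items()})
        for phi, a, r, t in symbolic_steps(term)
        if holds(rho, phi)
    ]


# (x > 1 -> (y <= 2 -> a(y).0)) + b().0
term = ("choice",
        ("guard", (("x", ">", 1),),
         ("guard", (("y", "<=", 2),), ("prefix", "a", {"y"}, NIL))),
        ("prefix", "b", set(), NIL))

steps = symbolic_steps(term)
at_zero = concrete_steps(term, {"x": Fraction(0), "y": Fraction(0)})
at_late = concrete_steps(term, {"x": Fraction(3, 2), "y": Fraction(3, 2)})
```

Note that the symbolic transitions do not depend on any valuation, which is what allows bisimulation to be analysed without evaluating clock constraints.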



3 Constraints and Symbolic Bisimulation 

This section is devoted to defining a symbolic version of timed bisimulation.
To ease the presentation we shall fix two timed automata and symbolically,
i.e. without evaluating clock constraints, compare them for bisimulation. To
avoid clock variables of one automaton being reset by the other, we always
assume the sets of clocks of the two timed automata under consideration are
disjoint and write $C$ for the union of the two clock sets. Let $N$ be the largest
natural number occurring in the constraints of the two automata. An atomic
constraint over $C$ with ceiling $N$ has one of the two forms: $x \bowtie m$ or $x - y \bowtie n$,
where $x, y \in C$, $\bowtie\ \in \{\leq, <, \geq, >\}$ and $m, n \leq N$ are natural numbers.

In the following, “atomic constraint” always means “atomic constraint over $C$
with ceiling $N$”. Note that given two timed automata there are only a finite number
of such atomic constraints. We shall use $c$ to range over atomic constraints.

A constraint, or zone, is a boolean combination of atomic constraints. A
constraint $\phi$ is consistent if there is some $\rho$ such that $\rho \models \phi$. Let $\phi$ and $\psi$ be two
constraints. We write $\phi \Rightarrow \psi$ to mean that $\rho \models \phi$ implies $\rho \models \psi$ for any $\rho$. Note that
the relation $\Rightarrow$ is decidable.

A region constraint, or region for short, $\phi$ is a consistent constraint containing
only the following atomic conjuncts:

- For each $i \in \{1, \ldots, n\}$ either $x_i = m_i$ or $m_i < x_i < m_i + 1$ or $x_i > N$;

- For each pair of $i, j \in \{1, \ldots, n\}$, $i \neq j$, either $x_i - m_i = x_j - m_j$ or
$x_i - m_i < x_j - m_j$ or $x_i - m_i > x_j - m_j$,

where the $m_i$ in $x_i - m_i$ of the second clause refers to the $m_i$ related to $x_i$ in
the first clause. In words, $m_i$ is the integral part of $x_i$ and $x_i - m_i$ its fractional
part.

Given a set of clock variables $C$ and a ceiling $N$, the set of region constraints
over $C$ is finite, and is denoted $\mathcal{RC}^{C}_{N}$. In the sequel, we will omit the sub- and
super-scripts when they can be supplied by the context.

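Fact 1 Let $\phi$ be a region constraint. If $\rho \models \phi$ and $\rho' \models \phi$ then

- For all $i \in \{1, \ldots, n\}$, if $\rho(x_i) \leq N$ then $\lfloor \rho(x_i) \rfloor = \lfloor \rho'(x_i) \rfloor$.

- For any $i, j \in \{1, \ldots, n\}$, $i \neq j$,

  • $\{\rho(x_i)\} = \{\rho(x_j)\}$ iff $\{\rho'(x_i)\} = \{\rho'(x_j)\}$ and

  • $\{\rho(x_i)\} < \{\rho(x_j)\}$ iff $\{\rho'(x_i)\} < \{\rho'(x_j)\}$,

where $\lfloor x \rfloor$ and $\{x\}$ are the integral and fractional parts of $x$, respectively.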

That is, two valuations satisfying the same region constraint must agree on their
integral parts as well as on the ordering of their fractional parts. Note that this
is precisely the definition of region equivalence due to Alur and Dill [AD94].
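A direct check of region equivalence between two concrete valuations can be sketched as follows. This is an illustration only: the function name `region_equivalent` is hypothetical, and the sketch follows the usual Alur-Dill formulation in which clocks above the ceiling are only required to be above the ceiling in both valuations, and fractional parts are compared only among clocks that do not exceed it. That treatment of large clocks is an assumption of the sketch.

```python
from fractions import Fraction
from math import floor


def frac(v):
    """Fractional part of a non-negative rational."""
    return v - floor(v)


def region_equivalent(rho, rho2, ceiling):
    """True if two valuations over the same clocks lie in the same region."""
    clocks = sorted(rho)
    for c in clocks:
        big1, big2 = rho[c] > ceiling, rho2[c] > ceiling
        if big1 != big2:
            return False
        if big1:
            continue  # both above the ceiling: nothing more to compare
        if floor(rho[c]) != floor(rho2[c]):
            return False
        # x = m and m < x < m + 1 are different region conjuncts.
        if (frac(rho[c]) == 0) != (frac(rho2[c]) == 0):
            return False
    small = [c for c in clocks if rho[c] <= ceiling]
    for i in small:
        for j in small:
            # Checking "<" on every ordered pair also fixes equality.
            if (frac(rho[i]) < frac(rho[j])) != (frac(rho2[i]) < frac(rho2[j])):
                return False
    return True


rho_a = {"x": Fraction(1, 2), "y": Fraction(1, 4)}
rho_b = {"x": Fraction(3, 4), "y": Fraction(1, 2)}   # same ordering of fractions
rho_c = {"x": Fraction(1, 4), "y": Fraction(1, 2)}   # ordering flipped

same = region_equivalent(rho_a, rho_b, 2)
different = region_equivalent(rho_a, rho_c, 2)
far = region_equivalent({"x": Fraction(5), "y": Fraction(1, 4)},
                        {"x": Fraction(7), "y": Fraction(1, 4)}, 2)
```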

The notion of a region constraint enjoys an important property: processes in
the same region behave uniformly with respect to timed bisimulation ([Čer92]):

Fact 2 Let $t$, $u$ be two timed automata with disjoint sets of clock variables and
$\phi$ a region constraint over the union of the two clock sets. Suppose that both $\rho$
and $\rho'$ satisfy $\phi$. Then $t\rho \sim u\rho$ iff $t\rho' \sim u\rho'$.



Fact 3 Suppose that $\phi$ is a region constraint and $\psi$ a zone. Then either $\phi \Rightarrow \psi$
or $\phi \Rightarrow \neg\psi$.

So a region is either entirely contained in a zone, or is completely outside a zone.
In other words, regions are the finest polyhedra that can be described by our
constraint language.

A canonical constraint is a disjunction of regions. Given a constraint we can
first transform it into disjunctive normal form, then decompose each disjunct
into a disjoint set of regions. Both steps can be effectively implemented. As a
corollary to Fact 3, if we write $\mathcal{RC}(\phi)$ for the set of regions contained in the zone
$\phi$, then $\bigvee \mathcal{RC}(\phi) = \phi$, i.e. $\bigvee \mathcal{RC}(\phi)$ is the canonical form of $\phi$.

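We will need two operators to deal with resetting. The first one is $\downarrow_{\mathbf{x}}$ where
$\mathbf{x} \subseteq C$. We first define it on regions, then generalise it to zones. By
abuse of notation, we will write $c \in \phi$ to mean that $c$ is a conjunct of $\phi$.

For a region $\phi$,

\[
\begin{array}{rcl}
\phi\downarrow_{\mathbf{x}} &=& \phi\Downarrow_{\mathbf{x}} \wedge \bigwedge\{\, x_i = 0 \mid x_i \in \mathbf{x} \,\} \wedge \bigwedge\{\, x_i = x_j \mid x_i, x_j \in \mathbf{x} \,\} \\
 && \wedge \bigwedge\{\, x_i = x_j - m \mid x_i \in \mathbf{x},\ x_j \notin \mathbf{x},\ x_j = m \in \phi \,\} \\
 && \wedge \bigwedge\{\, x_i < x_j - m \mid x_i \in \mathbf{x},\ x_j \notin \mathbf{x},\ x_j > m \in \phi \,\}
\end{array}
\]

and $\Downarrow_{\mathbf{x}}$ is defined by

\[
\begin{array}{rcll}
tt\Downarrow_{\mathbf{x}} &=& tt & \\
(c \wedge \phi)\Downarrow_{\mathbf{x}} &=& \phi\Downarrow_{\mathbf{x}} & \text{if } \mathbf{x} \cap fv(c) \neq \emptyset \\
(c \wedge \phi)\Downarrow_{\mathbf{x}} &=& c \wedge \phi\Downarrow_{\mathbf{x}} & \text{if } \mathbf{x} \cap fv(c) = \emptyset
\end{array}
\]

where $fv(c)$ is the set of clock variables appearing in the (atomic) constraint $c$.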

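Lemma 3.1. 1. $\rho \models \phi$ implies $\rho\{\mathbf{x} := 0\} \models \phi\downarrow_{\mathbf{x}}$.

2. If $\phi$ is a region constraint then so is $\phi\downarrow_{\mathbf{x}}$.

For a canonical constraint $\bigvee_i \phi_i$ with each $\phi_i$ a region, $(\bigvee_i \phi_i)\downarrow_{\mathbf{x}} = \bigvee_i \phi_i\downarrow_{\mathbf{x}}$.

For an arbitrary constraint $\phi$, $\phi\downarrow_{\mathbf{x}}$ is understood as the result of applying $\downarrow_{\mathbf{x}}$ to
the canonical form of $\phi$.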

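The second operator $\Uparrow$ is defined similarly. We first define it on regions:

\[ \phi\Uparrow \;=\; \phi\Uparrow' \wedge \bigwedge_{i \neq j} e_{ij}(\phi) \]

where $\Uparrow'$ is defined by

\[
\begin{array}{rcll}
tt\Uparrow' &=& tt & \\
(x < m \wedge \phi)\Uparrow' &=& x < N \wedge \phi\Uparrow' & \\
(x = m \wedge \phi)\Uparrow' &=& m \leq x \wedge \phi\Uparrow' & \\
(c \wedge \phi)\Uparrow' &=& c \wedge \phi\Uparrow' & \text{for other atomic constraints } c
\end{array}
\]

and

\[
e_{ij}(\phi) = \begin{cases} x_i - m_i = x_j - m_j & \text{if } x_i = m_i,\ x_j = m_j \in \phi \\ x_i - x_j < m_i - m_j & \text{otherwise} \end{cases}
\]

For an arbitrary constraint $\phi$, $\phi\Uparrow$ is understood as the result of applying $\Uparrow$ to
each disjunct of the canonical form of $\phi$.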

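Definition 3.2. $\phi$ is $\Uparrow$-closed if and only if $\phi\Uparrow = \phi$.

Lemma 3.3. 1. $\phi\Uparrow$ is $\Uparrow$-closed.

2. $\rho \models \phi$ implies $\rho \models \phi\Uparrow$.

3. If $\phi$ is $\Uparrow$-closed then $\rho \models \phi$ implies $\rho + d \models \phi$ for all $d \in \mathbb{R}^{\geq 0}$.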

Symbolic bisimulation will be defined as a family of binary relations indexed
by clock constraints. Following [Čer92] we use constraints over the union of the
(disjoint) clock sets of two timed automata as indices. Given a constraint $\phi$,
a finite set of constraints $\Phi$ is called a $\phi$-partition if $\bigvee \Phi = \phi$. A $\phi$-partition
$\Phi$ is called finer than another such partition $\Psi$ if $\Phi$ can be obtained from $\Psi$
by decomposing some of its elements. By the corollary to Fact 3, $\mathcal{RC}(\phi)$ is a
$\phi$-partition, and is the finest such partition. In particular, if $\phi$ is a region constraint
then $\{\phi\}$ is the only partition of $\phi$.
Definition 3.4. A constraint-indexed family of symmetric relations over terms
$\mathbf{S} = \{\, S^{\phi} \mid \phi\ \Uparrow\text{-closed} \,\}$ is a timed symbolic bisimulation if $(t, u) \in S^{\phi}$ implies

whenever $t \xrightarrow{\psi,a,\mathbf{x}} t'$ there is a $\phi \wedge \psi$-partition $\Phi$ such that for each
$\phi' \in \Phi$ there is $u \xrightarrow{\psi',a,\mathbf{y}} u'$ for some $\psi'$, $\mathbf{y}$ and $u'$ such that $\phi' \Rightarrow \psi'$ and
$(t', u') \in S^{\phi'\downarrow_{\mathbf{xy}}\Uparrow}$.

We write $t \sim^{\phi} u$ if $(t, u) \in S^{\phi} \in \mathbf{S}$ for some symbolic bisimulation $\mathbf{S}$.

It is easy to see that the $\phi \wedge \psi$-partition $\Phi$ used in the above definition can be
replaced by any partition finer than $\Phi$.

Timed symbolic bisimulation captures $\sim$ in the following sense:

Theorem 3.5. $t \sim^{\phi} u$ iff $t\rho \sim u\rho$ for any $\rho \models \phi$.

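Proof. ($\Rightarrow$) Assume $(t, u) \in S^{\phi} \in \mathbf{S}$ for some symbolic bisimulation $\mathbf{S}$. Define

\[ R = \{\, (t\rho, u\rho) \mid \text{there exists some } \phi \text{ such that } \rho \models \phi \text{ and } (t, u) \in S^{\phi} \in \mathbf{S} \,\} \]

We show $R$ is a timed bisimulation. Suppose $(t\rho, u\rho) \in R$, i.e. there is some $\phi$
such that $\rho \models \phi$ and $(t, u) \in S^{\phi}$.

- $t\rho \xrightarrow{a} t'\rho'$. By Lemma 2.3 there are $\psi$, $\mathbf{x}$ such that $\rho \models \psi$, $\rho' = \rho\{\mathbf{x} := 0\}$
and $t \xrightarrow{\psi,a,\mathbf{x}} t'$. So there is a $\phi \wedge \psi$-partition $\Phi$ with the properties
specified in Definition 3.4. Since $\rho \models \phi \wedge \psi$, $\rho \models \phi'$ for some $\phi' \in \Phi$. Let
$u \xrightarrow{\psi',a,\mathbf{y}} u'$ be the symbolic transition associated with this $\phi'$, as
guaranteed by Definition 3.4. Then $\phi' \Rightarrow \psi'$ and $(t', u') \in S^{\phi'\downarrow_{\mathbf{xy}}\Uparrow}$. Since $\rho \models \phi'$,
$u\rho \xrightarrow{a} u'\rho\{\mathbf{y} := 0\}$. By Lemma 3.1, $\rho\{\mathbf{xy} := 0\} \models \phi'\downarrow_{\mathbf{xy}}$. By Lemma 3.3,
$\rho\{\mathbf{xy} := 0\} \models \phi'\downarrow_{\mathbf{xy}}\Uparrow$. Therefore $(t'\rho\{\mathbf{xy} := 0\}, u'\rho\{\mathbf{xy} := 0\}) \in R$. Since
$t'\rho\{\mathbf{xy} := 0\} = t'\rho\{\mathbf{x} := 0\}$ and $u'\rho\{\mathbf{xy} := 0\} = u'\rho\{\mathbf{y} := 0\}$, this is the
same as $(t'\rho\{\mathbf{x} := 0\}, u'\rho\{\mathbf{y} := 0\}) \in R$.

- $t\rho \xrightarrow{d} t(\rho + d)$. Then also $u\rho \xrightarrow{d} u(\rho + d)$. Since $\phi$ is $\Uparrow$-closed, $\rho + d \models \phi$.
Therefore $(t(\rho + d), u(\rho + d)) \in R$.

($\Leftarrow$) Assume $t\rho \sim u\rho$ for any $\rho \models \phi_0$; we show $t \sim^{\phi_0} u$ as follows. For each
$\Uparrow$-closed $\phi$ define

\[ S^{\phi} = \{\, (t, u) \mid \forall \phi' \in \mathcal{RC}(\phi)\ \exists \rho \models \phi' \text{ s.t. } (t\rho, u\rho) \in R \,\} \]

and let $\mathbf{S} = \{\, S^{\phi} \mid \phi \text{ is } \Uparrow\text{-closed} \,\}$. Then by Fact 2, $(t, u) \in S^{\phi_0}$. $\mathbf{S}$ is
well-defined because $R$ is a timed bisimulation. We show $\mathbf{S}$ is a symbolic bisimulation.

Suppose $(t, u) \in S^{\phi}$ and let $t \xrightarrow{\psi,a,\mathbf{x}} t'$. Define $\Phi' = \{\, \phi' \mid \phi' \in \mathcal{RC}(\phi) \text{ and } \phi' \Rightarrow \psi \,\}$.
Then $\Phi'$ is a $\phi \wedge \psi$-partition. For each $\phi' \in \Phi'$, there exists $\rho$ s.t. $\rho \models \phi'$ with
$(t\rho, u\rho) \in R$. By the definition of $\Phi'$, $\rho \models \psi$. By Lemma 2.3, $t\rho \xrightarrow{a} t'\rho\{\mathbf{x} := 0\}$.
Since $(t\rho, u\rho) \in R$, $u\rho \xrightarrow{a} u'\rho'$ for some $u'$ and $\rho'$ with $(t'\rho\{\mathbf{x} := 0\}, u'\rho') \in R$.
By Lemma 2.3 again, $u \xrightarrow{\psi',a,\mathbf{y}} u'$ for some $\psi'$ and $\mathbf{y}$ with $\rho \models \psi'$ and $\rho' =
\rho\{\mathbf{y} := 0\}$. Hence $(t'\rho\{\mathbf{x} := 0\}, u'\rho\{\mathbf{y} := 0\}) \in R$, which is the same as
$(t'\rho\{\mathbf{xy} := 0\}, u'\rho\{\mathbf{xy} := 0\}) \in R$. From $\rho \models \phi'$, by Lemma 3.1 we have $\rho\{\mathbf{xy} := 0\} \models
\phi'\downarrow_{\mathbf{xy}}$. Since $\phi'$ is a region constraint, so is $\phi'\downarrow_{\mathbf{xy}}$, which is the only element of
$\mathcal{RC}(\phi'\downarrow_{\mathbf{xy}})$. Therefore $(t', u') \in S^{\phi'\downarrow_{\mathbf{xy}}\Uparrow}$.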
S1  $X + \mathbf{0} = X$

S2  $X + X = X$

S3  $X + Y = Y + X$

S4  $(X + Y) + Z = X + (Y + Z)$

Fig. 4. The Equational Axioms

4 The Proof System 

The proof system consists of a set of equational axioms in Figure 4 and a set of
inference rules in Figure 5. The judgments of the inference system are conditional
equations of the form

\[ \phi \rhd t = u \]

with $\phi$ a constraint and $t$, $u$ terms. Its intended meaning is “$t\rho \sim u\rho$ for any
$\rho \models \phi$”. $t = u$ abbreviates $tt \rhd t = u$.

The equational axioms are the standard monoid laws for bisimulation [Mil89].
The set of inference rules extends equational reasoning by introducing a rule
for each construct in the process language. CONGR-+ expresses the fact that
bisimulation is preserved by $+$. The rule GUARD permits a case analysis on
conditionals. It is all we need to reason about this construct. ACTION is the
introduction rule for action prefixing. This rule is complicated by the fact that an
action has a clock resetting associated with it, which necessitates the two
operators $\downarrow_{\mathbf{xy}}$ and $\Uparrow$. It requires a side condition to make sure that clock resetting in one
process does not interfere with the other. Finally, the two rules PARTITION and
ABSURD have nothing to do with any specific constructs in the language. They
are so-called “structural rules” used to “glue” pieces of derivations together.
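To see what the four monoid laws amount to in practice, one can observe that they allow a sum to be treated as a set of summands: S1 discards the inactive process, S2 removes duplicates, and S3 and S4 make order and bracketing irrelevant. The sketch below checks equality modulo S1 to S4 at the top level of a term only. The tuple encoding of terms and the function names are assumptions of this illustration, and summands are compared syntactically, so equalities that need the inference rules (or the laws applied underneath a prefix) are not detected.

```python
# Terms as hashable tagged tuples (an encoding assumed for this sketch):
#   ("nil",), ("choice", t, u), ("prefix", action, frozenset_of_clocks, t)
NIL = ("nil",)


def summands(term):
    """Flatten a sum into the set of its summands other than 0."""
    if term[0] == "choice":
        return summands(term[1]) | summands(term[2])
    if term[0] == "nil":
        return frozenset()
    return frozenset({term})


def equal_modulo_monoid_laws(t, u):
    """Top-level equality of two sums modulo S1-S4."""
    return summands(t) == summands(u)


a = ("prefix", "a", frozenset(), NIL)
b = ("prefix", "b", frozenset({"x"}), NIL)

left = ("choice", ("choice", a, b), NIL)   # (a + b) + 0
right = ("choice", b, ("choice", a, a))    # b + (a + a)

same = equal_modulo_monoid_laws(left, right)
different = equal_modulo_monoid_laws(a, b)
```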

Let us write $\vdash \phi \rhd t = u$ to mean that $\phi \rhd t = u$ can be derived from this proof
system.

Some useful properties of the proof system are summarised in the following 
proposition: 

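Proposition 4.1. 1. $\vdash \phi \to (\psi \to t) = \phi \wedge \psi \to t$

2. $\vdash t = t + \phi \to t$

3. If $\phi \Rightarrow \psi$ then $\vdash \phi \rhd t = \psi \to t$

4. $\vdash \phi \wedge \psi \rhd t = u$ implies $\vdash \phi \rhd \psi \to t = \psi \to u$

5. $\vdash \phi \to (t + u) = \phi \to t + \phi \to u$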

The rule PARTITION has a more general form: 

Proposition 4.2. Suppose $\Phi$ is a finite set of constraints and $\bigvee \Phi = \phi$. If
$\vdash \psi \rhd t = u$ for each $\psi \in \Phi$, then $\vdash \phi \rhd t = u$.

Soundness of the proof system is stated below: 

Theorem 4.3. If $\vdash \phi \rhd t = u$ and $\phi$ is $\Uparrow$-closed then $t \sim^{\phi} u$.

Now we discuss the completeness of the proof system, and we shall confine
ourselves to the recursion-free subset of the language. As usual the completeness
proof uses the notion of a normal form.
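\[
\begin{array}{lll}
\text{EQUIV} & t = t \qquad \dfrac{t = u}{u = t} \qquad \dfrac{t = u \quad u = v}{t = v} & \\[3ex]
\text{AXIOM} & t = u & t = u \text{ is an axiom instance} \\[2ex]
\text{CONGR-+} & \dfrac{t = t'}{t + u = t' + u} & \\[3ex]
\text{GUARD} & \dfrac{\phi \wedge \psi \rhd t = u \qquad \phi \wedge \neg\psi \rhd \mathbf{0} = u}{\phi \rhd \psi \to t = u} & \\[3ex]
\text{ACTION} & \dfrac{\phi\downarrow_{\mathbf{xy}}\Uparrow\ \rhd\ t = u}{\phi \rhd a(\mathbf{x}).t = a(\mathbf{y}).u} & \mathbf{y} \cap C(t) = \mathbf{x} \cap C(u) = \emptyset \\[3ex]
\text{PARTITION} & \dfrac{\phi_1 \rhd t = u \qquad \phi_2 \rhd t = u}{\phi \rhd t = u} & \phi \Rightarrow \phi_1 \vee \phi_2 \\[3ex]
\text{ABSURD} & ff \rhd t = u &
\end{array}
\]

Fig. 5. The Inference Rules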



Definition 4.4. A term $t$ is a normal form if $t \equiv \sum_{i} \phi_i \to a_i(\mathbf{x}_i).t_i$ and each $t_i$
is a normal form.

Definition 4.5. The height of a term $t$, denoted $|t|$, is defined thus:

- $|\mathbf{0}| = 0$

- $|t + u| = \max\{\, |t|, |u| \,\}$

- $|\phi \to t| = |t|$

- $|a(\mathbf{x}).t| = 1 + |t|$
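The four clauses translate directly into a recursive function. The sketch below uses a tagged-tuple encoding of recursion-free terms, which is an assumption of this illustration rather than notation from the paper.

```python
# Terms as tagged tuples (an encoding assumed for this sketch):
#   ("nil",), ("choice", t, u), ("guard", phi, t), ("prefix", a, clocks, t)
NIL = ("nil",)


def height(term):
    """Height of a recursion-free term: the maximal nesting depth of prefixes."""
    tag = term[0]
    if tag == "nil":
        return 0
    if tag == "choice":
        return max(height(term[1]), height(term[2]))
    if tag == "guard":
        return height(term[2])
    if tag == "prefix":
        return 1 + height(term[3])
    raise ValueError(f"unknown term constructor: {tag!r}")


# phi -> a(x).(b(y).0 + 0), with the guard left abstract as a string
example = ("guard", "phi",
           ("prefix", "a", frozenset({"x"}),
            ("choice", ("prefix", "b", frozenset({"y"}), NIL), NIL)))
example_height = height(example)
```

Guards and choices do not add to the height, which is why induction on height is a natural fit for a proof that peels off one action prefix at a time.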

Lemma 4.6. For every term $t$ there exists a normal form $t'$ such that $|t| = |t'|$
and $\vdash t = t'$.

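Theorem 4.7. For recursion-free terms $t$ and $u$, $t \sim^{\phi} u$ implies $\vdash \phi \rhd t = u$.

Proof. By Lemma 4.6 we assume $t$, $u$ are in normal form:

\[ t \equiv \sum_{i \in I} \phi_i \to a_i(\mathbf{x}_i).t_i \qquad u \equiv \sum_{j \in J} \psi_j \to b_j(\mathbf{y}_j).u_j \]

Without loss of generality, we may assume $a_i = b_j = a$ for all $i$ and $j$.
Apply induction on the joint height of $t$ and $u$. The base case is trivial. For
the induction step, let $\phi' \in \mathcal{RC}(\phi)$. For each $i \in I$, $t \xrightarrow{\phi_i,a,\mathbf{x}_i} t_i$. Since $t \sim^{\phi} u$,
there exists a $\phi \wedge \phi_i$-partition $\Phi$ with the properties specified in Definition 3.4.
Without loss of generality, we assume each element of $\Phi$ is a region constraint,
i.e. $\Phi = \mathcal{RC}(\phi \wedge \phi_i)$. Since $\phi'$ is a region, by Fact 3 there are two cases:

(case 1) $\phi' \wedge \phi_i = ff$, i.e. $\phi' \Rightarrow \neg\phi_i$. By GUARD and ABSURD we can derive
$\vdash \phi' \rhd \phi_i \to a(\mathbf{x}_i).t_i = \mathbf{0}$.

(case 2) $\phi' \Rightarrow \phi_i$, i.e. $\phi' \in \Phi$. By the definition of symbolic bisimulation, there
is some $j \in J$ such that $\phi' \Rightarrow \psi_j$, $u \xrightarrow{\psi_j,a,\mathbf{y}_j} u_j$ and $t_i \sim^{\phi'\downarrow_{\mathbf{x}_i\mathbf{y}_j}\Uparrow} u_j$. By induction
we have

\[ \vdash \phi'\downarrow_{\mathbf{x}_i\mathbf{y}_j}\Uparrow\ \rhd\ t_i = u_j \]

By ACTION,

\[ \vdash \phi' \rhd a(\mathbf{x}_i).t_i = a(\mathbf{y}_j).u_j \]

Since $\phi' \Rightarrow \phi_i$ and $\phi' \Rightarrow \psi_j$, by Proposition 4.1,

\[ \vdash \phi' \rhd \phi_i \to a(\mathbf{x}_i).t_i = \psi_j \to a(\mathbf{y}_j).u_j \]

Symmetrically, for each $j \in J$, either $\vdash \phi' \rhd \psi_j \to a(\mathbf{y}_j).u_j = \mathbf{0}$ or there is some
$i \in I$ such that

\[ \vdash \phi' \rhd \psi_j \to a(\mathbf{y}_j).u_j = \phi_i \to a(\mathbf{x}_i).t_i \]

Therefore, using S1–S4 and CONGR-+, we can conclude

\[ \vdash \phi' \rhd t = \sum_{i \in I} \phi_i \to a_i(\mathbf{x}_i).t_i + \sum_{j \in J} \psi_j \to b_j(\mathbf{y}_j).u_j \]

and

\[ \vdash \phi' \rhd u = \sum_{i \in I} \phi_i \to a_i(\mathbf{x}_i).t_i + \sum_{j \in J} \psi_j \to b_j(\mathbf{y}_j).u_j \]

Hence

\[ \vdash \phi' \rhd t = u \]

Finally an application of Proposition 4.2 gives the required

\[ \vdash \phi \rhd t = u \]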



5 Invariants 

One important variation on the notion of timed automata is to associate an
invariant condition with each node of the automaton to model progress behaviours.
According to the transitional semantics of Figure 2, a process can delay forever at
any location (node). To disallow such arbitrary delays each location in a timed
automaton is assigned an invariant constraint, with the interpretation that delay
transitions at a node will not be possible when the invariant at the node is
violated.
To describe timed automata with invariants we extend our language as follows:

\[
\begin{array}{rcl}
s &::=& \{\phi\}t \\
t &::=& \mathbf{0} \mid \phi \to t \mid a(\mathbf{x}).s \mid t + t \mid X \mid \mathbf{fix}Xt
\end{array}
\]

In this language, invariants cannot occur at places which do not correspond to
locations in timed automata. For instance, strings having the forms
$\{\phi\}t + \{\psi\}u$ or $\mathbf{fix}X\{\phi\}t$ are not terms of the language, while $\{\phi\}(t + u)$ and
$\phi \to a(\mathbf{x}).\{\psi\}t$ are allowed.

We assign each term $t$ an invariant constraint $Inv(t)$ by letting

\[ Inv(t) = \begin{cases} \phi & \text{if } t \text{ has the form } \{\phi\}s \\ tt & \text{otherwise} \end{cases} \]

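Furthermore, we add a side condition to the rule DELAY in Figure 2, plus a new
rule INV to deal with invariants:

\[
\begin{array}{lll}
\text{DELAY} & t\rho \xrightarrow{d} t(\rho + d) & \rho + d' \models Inv(t) \text{ for any } 0 \leq d' \leq d \\[3ex]
\text{INV} & \dfrac{t\rho \xrightarrow{a} t'\rho'}{(\{\phi\}t)\rho \xrightarrow{a} t'\rho'} & \rho \models \phi
\end{array}
\]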

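For the symbolic transitional semantics, we simply forget the invariants (recall
that symbolic transitions correspond to edges of automata, while invariants
reside in nodes):

\[ \text{Inv} \qquad \dfrac{t \xrightarrow{\psi,a,\mathbf{x}} t'}{\{\phi\}t \xrightarrow{\psi,a,\mathbf{x}} t'} \]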

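A slightly modified version of Lemma 2.3 holds:

Lemma 5.1. 1. If $t \xrightarrow{\phi,a,\mathbf{x}} t'$ then $t\rho \xrightarrow{a} t'\rho\{\mathbf{x} := 0\}$ for any $\rho \models \phi \wedge Inv(t)$.

2. If $t\rho \xrightarrow{a} t'\rho'$ then there exist $\phi$, $\mathbf{x}$ such that $\rho \models \phi \wedge Inv(t)$, $\rho' = \rho\{\mathbf{x} := 0\}$
and $t \xrightarrow{\phi,a,\mathbf{x}} t'$.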



Definition 2.2 of timed bisimulation remains the same, since the intended
effects of invariants have already been captured in the transition rules for the
standard transitional semantics. On the other hand, the definition of symbolic
timed bisimulation, Definition 3.4, should be modified slightly to accommodate
invariants:



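Definition 5.2. A constraint-indexed family of symmetric relations over terms
$\mathbf{S} = \{\, S^{\phi} \mid \phi\ \Uparrow\text{-closed} \,\}$ is a timed symbolic bisimulation if $(t, u) \in S^{\phi}$ implies

1. $\phi \Rightarrow (Inv(t) \Rightarrow Inv(u))$ and

2. whenever $t \xrightarrow{\psi,a,\mathbf{x}} t'$ then there is an $Inv(t) \wedge \phi \wedge \psi$-partition $\Phi$ such that for
each $\phi' \in \Phi$ there is $u \xrightarrow{\psi',a,\mathbf{y}} u'$ for some $\psi'$, $\mathbf{y}$ and $u'$ such that $\phi' \Rightarrow \psi'$ and
$(t', u') \in S^{\phi'\downarrow_{\mathbf{xy}}\Uparrow}$.

We write $t \sim^{\phi} u$ if $(t, u) \in S^{\phi} \in \mathbf{S}$ for some symbolic bisimulation $\mathbf{S}$.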
We have the counterpart of Theorem 3.5:

Theorem 5.3. $t \sim^{\phi} u$ iff $t\rho \sim u\rho$ for any $\rho \models \phi$.

The proof of this theorem is very similar to that of Theorem 3.5, with the uses
of Lemma 2.3 replaced by Lemma 5.1.

Concerning the proof system, we add a rule to deal with the new construct $\{\psi\}t$:

\[ \dfrac{\phi \wedge \psi \rhd t = u \qquad \phi \wedge \neg\psi \rhd \{ff\}\mathbf{0} = u}{\phi \rhd \{\psi\}t = u} \]

This rule appears similar to the GUARD rule. However, there is a crucial
difference: when the guard $\psi$ is false, $\psi \to t$ behaves like $\mathbf{0}$, the process which is
inactive but can allow time to pass. On the other hand, when the invariant is
false, $\{\psi\}t$ behaves like $\{ff\}\mathbf{0}$, the process usually referred to as time-stop, which
is not only inactive but also “still”: it cannot even let time elapse.

With these modifications the completeness result carries over to the new 
setting: 

Theorem 5.4. For recursion-free terms $t$ and $u$ in the extended language,
$t \sim^{\phi} u$ implies $\vdash \phi \rhd t = u$.

The proof uses the following normal form taking invariants into account:

\[ \{\psi\} \sum_{i \in I} \phi_i \to a_i(\mathbf{x}_i).t_i \]

The technical details of the proof are almost the same as those of Theorem 4.7.

6 Conclusion 

We have proposed a theory of symbolic bisimulation and presented a proof
system for timed automata. Using conditional equations as judgments, the proof
system separates manipulation of time from reasoning about process equivalence.
As a result the proof system is much simpler than a purely equational formulation.
It is shown that by generalising pure equational reasoning to a set of inference
rules dealing with the specific language constructs needed for timed automata, the
standard monoid laws for bisimulation are sufficient for characterising
bisimulation in the timed world. This result agrees with previous work on proof
systems for value-passing processes [HL96] and for the $\pi$-calculus [Lin94], providing
further evidence that the four monoid laws capture the essence of bisimulation.

The proof system presented in the current paper is complete only over finite
timed automata, i.e. the subset of timed automata which do not involve loops. We
conjecture that by adding a suitable version of unique fixpoint induction [Mil84],
together with the standard laws for folding/unfolding recursions, a complete
proof system for the whole set of timed automata can be achieved. A similar
result has been reported in [AJ94] for regular timed CCS [Wan91]. We leave this
as a topic for future research.
Abstract. This paper describes the categorical semantics of a system
of mixed intuitionistic and linear type theory (ILT). ILT was proposed 
by G. Plotkin and also independently by P. Wadler. The logic associ- 
ated with ILT is obtained as a combination of intuitionistic logic with 
intuitionistic linear logic, and can be embedded in Barber and Plotkin’s 
Dual Intuitionistic Linear Logic (DILL). However, unlike DILL, the logic 
for ILT lacks an explicit modality ! that translates intuitionistic proofs 
into linear ones. So while the semantics of DILL can be given in terms 
of monoidal adjunctions between symmetric monoidal closed categories 
and cartesian closed categories, the semantics of ILT is better presented 
via fibrations. These interpret double contexts, which cannot be reduced 
to linear ones. In order to interpret the intuitionistic and linear iden- 
tity axioms acting on the same type we need fibrations satisfying the 
comprehension axiom. 



1 Introduction 

This paper arises from the need to fill a gap in the conceptual development
of the xSLAM project. The xSLAM project is concerned with the design and
implementation of abstract machines based on linear logic. For xSLAM we
initially developed a linear $\lambda$-calculus by adding explicit substitutions to Barber
and Plotkin’s DILL [GdPR00]. We then considered the categorical models one
obtains for both intuitionistic and linear logic with explicit substitutions in the
style of Abadi et al. [GdPR99].

The DILL system [BP97] distinguishes between intuitionistic and linear
variables: linear variables are used once during evaluation, intuitionistic ones
arbitrarily often. This is a key feature for the optimisation which linear logic provides
for the implementation of functional programming languages. But in DILL the
intuitionistic implication is defined in terms of linear implication and the
modality $!$ via the standard Girard translation, namely $A \to B = (!A) \multimap B$. This is not
appropriate for implementations of functional languages. The reason is that in
the translation of the simply-typed $\lambda$-calculus into DILL $!$’s occur only in types
* Research supported by EPSRC-grant GR/L28296 under the title “The eXplicit Sub- 
stitution Linear Abstract Machine”. 

J. Tiuryn (Ed.): FOSSACS 2000, LNCS 1784, pp. 223- 12771 2000. 
$!A \multimap B$, and the linearity is in effect not used. Indeed, a function of this type is
applied only to arguments with no free linear variables, and during the
execution of the program these arguments will be substituted only for intuitionistic
variables. Finally, we want to detect immediately when a function is
intuitionistic. Hence it is more appropriate to have both $\to$ and $\multimap$ as primitive operations
and disregard $!$. This leads to consideration of the mixed intuitionistic and linear
type theory (henceforth named ILT) described by Plotkin [Plo93] and Wadler
[Wad90], obtained from DILL by (i) adding intuitionistic implication, and (ii)
removing the modality $!$ from the type operators.

The syntactic behaviour of ILT is very similar to that of DILL. But when it 
comes to semantics, the situation is a little more complicated. It is not obvious 
how to restrict the idea of a symmetric monoidal adjunction, so that we capture 
all the behaviour of intuitionistic implication, without at the same time, import- 
ing all the machinery for modelling the modality !. But if we step back and look 
at our models for calculi of explicit substitution, we can see that modelling intu- 
itionistic logic using fibrations can be combined with modelling (intuitionistic) 
linear logic using symmetric monoidal closed categories, and in a way that does 
not bring in all the machinery for !. 

The expert reader will note that the fibration modelling of intuitionistic logic 
is only necessary for dealing with predicates and/or dependent types; and this 
paper is only concerned with propositional intuitionistic logic. However, fibration 
modelling does provide a means of adding linear type theory to intuitionistic type 
theory in the required way. This is the main result we establish in this paper. 

The paper is organised as follows. In the first section we describe the calculus 
ILT. In the next section we define IL-indexed categories and prove soundness 
and completeness of ILT with respect to them. In the third section we show that 
ILT is the internal language of (a suitable restriction of) IL-indexed categories. 
Finally in the fourth section we add exponentials to these IL-indexed categories 
and we prove the equivalence between them and the models given by a symmetric 
monoidal adjunction between a symmetric monoidal closed category with finite 
products and a cartesian closed category that is the co-Kleisli category with 
respect to the comonad induced by the adjunction. 



2 Intuitionistic and Linear Type Theory 

The system of mixed intuitionistic and linear logic that we model in this paper,
to be called Intuitionistic and Linear Type Theory or ILT for short, borrows
from Girard’s Logic of Unity the elegant idea of separating assumptions into two
classes: intuitionistic, which can be freely duplicated (shared) or discarded
(ignored); and linear, which are constrained to be used exactly once. Syntactically,
this strict separation is achieved by maintaining judgements with double-sided
(“dual”) contexts $\Gamma \mid \Delta \vdash A$, where, as a convention, $\Gamma$ and $\Delta$ contain non-linear
(intuitionistic) and linear assumptions, respectively. Another distinguishing
feature of ILT is that it has both intuitionistic ($A \to B$) and linear implications
($A \multimap B$), as well as additive ($A \,\&\, B$) and multiplicative ($A \otimes B$) conjunctions with
their units ($1$ and $I$), but no modality (or exponential) types $!A$. This system
should not be confused with BI, the logic of bunched implications proposed by
O’Hearn and Pym [OP99], whose propositional fragment has the same operators,
but with very different behaviour.

ILT’s closest relative is Barber and Plotkin’s DILL [BP97], and
most of its syntactic properties can be easily derived from DILL’s properties.
But semantics is a different story: DILL’s rather elegant semantics in terms of
a monoidal adjunction between a symmetric monoidal closed category L and
a cartesian closed category C is not suitable for ILT, as ILT has no terms (or
morphisms) corresponding to the modality per se. For instance, ILT has no term
corresponding to $id : {!A} \to {!A}$. This section briefly describes the system ILT.

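The set of types we shall work with is

\[ A ::= G \mid A \to B \mid A \multimap B \mid A \otimes B \mid I \mid A \,\&\, B \mid 1 \]

The syntax of preterms is defined inductively by

\[
\begin{array}{rcl}
M, N &::=& a \mid x \mid \lambda a^{A}.M \mid \lambda x^{A}.M \mid M_l N \mid M_i N \\
 &\mid& M \otimes N \mid \mathsf{let}\ M\ \mathsf{be}\ a \otimes b\ \mathsf{in}\ N \mid (M, N) \mid \mathsf{Fst}(M) \mid \mathsf{Snd}(M) \\
 &\mid& \circ \mid \bullet \mid \mathsf{let}\ M\ \mathsf{be}\ \bullet\ \mathsf{in}\ N
\end{array}
\]

where $a$ and $x$ range over countable sets of linear and intuitionistic variables,
respectively. This distinction of variables is not strictly necessary, but we adopt
it here to aid legibility. Because the two let-expressions behave so similarly we
sometimes write $\mathsf{let}\ M\ \mathsf{be}\ p\ \mathsf{in}\ N$ to cover both, where $p$ is either $a \otimes b$ or $\bullet$.
The typing rules for ILT are standard, see Table 1.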

We have three kinds of equations, (3 and 77 -equations and commuting conver- 
sions. The last kind of equations, familiar in the setting of linear lambda-calculi, 
arise due to the form of 77 -rules for the tensor product and its unit. For the pre- 
sentation of these equations we use contexts-with-holes, written C[_]. They are 
given by the grammar 

C[_] ::= _ I Xa^.Cl] \ Xx^.Cl] \ C[_],M | | C[_]zM | MzC[_] 

I C[X\^M I M(g)C[_] I let C[_] be p in iV | let M be p in C[_] 

Note that this definition implies that there is exactly one occurrence of the
symbol $\_$ in a context-with-hole $C[\_]$. The term $C[M]$ denotes the replacement
of $\_$ in $C[\_]$ by $M$ with the possible capture of free variables. This capture is the
difference between the replacement of $\_$ and substitution for a free variable: if
$C[\_]$ is the context-with-hole $(\lambda a^{A}.\_)$, then $C[a] = \lambda a^{A}.a$, whereas
$(\lambda a^{A}.b)[a/b] = \lambda c^{A}.a$. The equations for ILT are given in the table of equations.
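The difference between filling a hole and substituting for a free variable can be checked on a small example. The following is a minimal sketch, not part of ILT itself: it uses an untyped fragment with variables, abstraction and application only, and the names `Var`, `Lam`, `App`, `Hole`, `plug` and `subst` are hypothetical helpers introduced for illustration.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    param: str
    body: object


@dataclass(frozen=True)
class App:
    fun: object
    arg: object


@dataclass(frozen=True)
class Hole:
    pass


def free_vars(t):
    """Set of free variable names of a term (a hole has none)."""
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, Lam):
        return free_vars(t.body) - {t.param}
    if isinstance(t, App):
        return free_vars(t.fun) | free_vars(t.arg)
    return set()


def plug(ctx, m):
    """C[M]: replace the hole by m; binders of ctx may capture variables of m."""
    if isinstance(ctx, Hole):
        return m
    if isinstance(ctx, Lam):
        return Lam(ctx.param, plug(ctx.body, m))
    if isinstance(ctx, App):
        return App(plug(ctx.fun, m), plug(ctx.arg, m))
    return ctx


def fresh(avoid):
    """Return a name c0, c1, ... that does not occur in avoid."""
    for i in count():
        name = f"c{i}"
        if name not in avoid:
            return name


def subst(t, x, n):
    """t[n/x]: capture-avoiding substitution of n for the free variable x."""
    if isinstance(t, Var):
        return n if t.name == x else t
    if isinstance(t, App):
        return App(subst(t.fun, x, n), subst(t.arg, x, n))
    if isinstance(t, Lam):
        if t.param == x:
            return t  # x is bound here, nothing to substitute
        if t.param in free_vars(n):
            # rename the bound variable first so that n is not captured
            new = fresh(free_vars(n) | free_vars(t.body) | {x})
            renamed = subst(t.body, t.param, Var(new))
            return Lam(new, subst(renamed, x, n))
        return Lam(t.param, subst(t.body, x, n))
    return t


captured = plug(Lam("a", Hole()), Var("a"))      # (λa._)[a]  gives λa.a
avoided = subst(Lam("a", Var("b")), "b", Var("a"))  # (λa.b)[a/b] gives λc.a
```

Here `plug` lets the binder capture the variable, while `subst` renames the bound variable first, mirroring the two results in the example above.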

Note that in ILT linear variables can move across the divisor of the context
as expressed in the following lemma.

Lemma 1 For every ILT derivable judgement $\Gamma \mid a_1: A_1, \ldots, a_n: A_n, \Delta \vdash M: B$
we can derive $\Gamma, x_1: A_1, \ldots, x_n: A_n \mid \Delta \vdash M[x_1/a_1, \ldots, x_n/a_n]: B$.

3 Categorical Semantics of ILT

The basis for our categorical model of ILT is Ehrhard's notion of a D-category
for modelling dependent types [Ehr88], which goes back to Lawvere's idea of
hyperdoctrines satisfying the comprehension axiom [Law70]. Hyperdoctrines model
many-sorted predicate logic, where predicates are indexed over sorts or sets. A
suitable adjunction allows the interpretation of the comprehension axiom, that
is the creation of a subset defined by a predicate indexed over a set. Ehrhard
generalized this idea in terms of fibrations, introducing D-categories to interpret
the Calculus of Constructions [Ehr88]. Here we adopt D-categories to model
ILT. The absence of type dependencies will be expressed by some
restrictions that we put on the particular D-categories we use to prove that
ILT is their internal language.

In order to make more explicit the structure we need to interpret our calculus
we recall the definition of D-categories in terms of indexed categories, which are
categorically equivalent to fibrations. A D-category is a split indexed category
$E: B^{op} \to \mathrm{Cat}$ where the base category $B$ models contexts and the fibre over an
object $\Gamma$ models terms whose free variables are contained in the context modelled
by $\Gamma$. We require both $B$ and each fibre $E(\Gamma)$, for $\Gamma$ in $B$, to have a terminal object
$\top$. We also require that for every morphism $f$ in the base category $E(f)$ preserves
the terminal object. The fibration associated to this indexed category is the
projection functor $p: Gr(E) \to B$, where $Gr(E)$ is the Grothendieck completion
(see also page 107 of [Jac99]). We recall that the objects of the Grothendieck
completion of $E$ are the pairs $(\Gamma, A)$ where $\Gamma$ is an object of $B$ and $A$ is an
object of $E(\Gamma)$. The morphisms of $Gr(E)$ between $(\Gamma, A)$ and $(\Delta, C)$ are pairs
$(f, h)$ where $f: \Gamma \to \Delta$ is a morphism in $B$ and $h: A \to E(f)(C)$ is a morphism
in $E(\Gamma)$. For every object $\Gamma$ in $B$ the category $E(\Gamma)$ is called the fibre of $p$ over
the object $\Gamma$.

The key construction of a D-category to interpret contexts and substitutions
is the requirement called the "comprehension property", i.e. the requirement that
the terminal object functor $T: B \to Gr(E)$ has a right adjoint $G: Gr(E) \to B$.
Recall that the functor $T$ is defined as follows: for every object $\Gamma$ in the base
category $B$, $T(\Gamma) = (\Gamma, \top)$ and for every morphism $f$, $T(f) = (f, \mathrm{Id})$. Actually
$T$ is an embedding functor of the base category $B$ into the fibres of $E$. The
right adjoint to $T$ ensures that every object, which for example interprets a
sequent $\Gamma \vdash A$ in the fibre over the object interpreting the context $\Gamma$, can be
put in correspondence with a context, in the example $\Gamma, A$, in the base category.
Moreover, via $T$ a morphism in the fibre corresponds to a morphism in the base
category, and this allows us to model substitution by the re-indexing functor.
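Written as a hom-set bijection, the comprehension property reads as follows. This is the standard unfolding of an adjunction $T \dashv G$ rather than a formula taken from the text; $T$ denotes the terminal object functor and $\top$ the terminal object of a fibre.

```latex
\[
\mathrm{Hom}_{Gr(E)}\bigl((\Gamma, \top),\, (\Delta, A)\bigr)
\;\cong\;
\mathrm{Hom}_{B}\bigl(\Gamma,\, G(\Delta, A)\bigr)
\qquad \text{naturally in } \Gamma \text{ and } (\Delta, A).
\]
```

A morphism on the left is a pair $(f, h)$ with $f: \Gamma \to \Delta$ in $B$ and $h: \top \to E(f)(A)$ in $E(\Gamma)$, so a map into the extended context $G(\Delta, A)$ amounts to a map $f$ into $\Delta$ together with a term of type $A$ reindexed along $f$.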

The idea for the model of ILT is to modify this setting to capture the
separation between intuitionistic and linear variables in ILT (with their corresponding
substitutions) and simultaneously to model the two identity axioms, i.e. the
assumptions of intuitionistic and linear variables, acting on the same types. The
base category $B$ models only the intuitionistic contexts of ILT, i.e., objects in $B$
model contexts $(\Gamma \mid \_)$. Each fibre over an object in $B$ modelling a context $\Gamma \mid \_$
models terms $\Gamma \mid \Delta \vdash M: A$ for any context $\Delta$. We require a terminal object
in $B$. The fibres are now symmetric monoidal closed categories with finite
products (SMCP categories) and model the linear constructions of ILT. The functors
between the fibres have to preserve the SMCP structure.

Note that from now on when we refer to indexed categories we mean split indexed
categories, i.e. the pseudofunctor towards Cat is actually a functor.

Since we no longer require each fibre to have a terminal object, we replace
the right adjoint to the terminal object functor $T$ by a right adjoint $G$ to the
unit functor $U: B \to Gr(E)$, assigning to each object $\Gamma$ in $B$ the object $(\Gamma, I)$.
This right adjoint $G$ is the comprehension functor. In this way we obtain that
morphisms in the base correspond to morphisms with domain $I$ in the fibre, i.e.,
terms with no free linear variables.

Now we can model substitution for intuitionistic variables by reindexing along
morphisms in the base as usual: this adjunction $U \dashv G$ enforces the restriction
that only terms with no free linear variables can be substituted for intuitionistic
variables. Intuitionistic function spaces are modelled in the standard way by the
right adjoint to weakening.
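The correspondence between base morphisms and fibre morphisms with domain $I$ can be spelled out as a bijection. The display below is a sketch that unfolds the adjunction $U \dashv G$ in the standard way; the abbreviation $\Delta.A$ for $G(\Delta, A)$ and the name $\mathrm{Fst}$ for the first component of the counit follow the definition of IL-indexed categories.

```latex
\[
\mathrm{Hom}_{Gr(E)}\bigl((\Gamma, I),\, (\Delta, A)\bigr)
\;\cong\;
\mathrm{Hom}_{B}\bigl(\Gamma,\, \Delta.A\bigr)
\]
\[
(f, h) \ \text{with}\ f: \Gamma \to \Delta,\quad h: I \to f^{*}(A) \ \text{in}\ E(\Gamma)
\qquad\longleftrightarrow\qquad
g: \Gamma \to \Delta.A \ \text{with}\ \mathrm{Fst} \cdot g = f
\]
```

In particular, taking $\Delta = \Gamma$ and $f = \mathrm{Id}$, a term $h: I \to A$ without free linear variables corresponds to a morphism $g: \Gamma \to \Gamma.A$ with $\mathrm{Fst} \cdot g = \mathrm{Id}$, and reindexing along $g$ performs the substitution.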

Definition 2 Let $B$ be a category with a terminal object $\top$. An IL-indexed
category is a functor $E: B^{op} \to \mathrm{Cat}$ such that the following conditions are satisfied.
(Note that we write $f^{*}(-)$ for the application of the functor $E$ to $f$, for any
morphism $f$ in $B$.)

(i) $E(\Gamma)$ is a symmetric monoidal closed category with finite products, i.e. an
SMCP category, for each object $\Gamma$ of $B$. Moreover for each morphism $f$ in $B$, the
functor $f^{*}$ preserves this SMCP structure on the nose, i.e. it is an SMCP
functor.

For every object $\Gamma$ in $B$, we denote the terminal object of $E(\Gamma)$ by $\top$, the
unique map towards $\top$ from every object $C$ of $E(\Gamma)$ by $\mathrm{ter}_C$, the product of
two objects $A$ and $B$ by $A \times B$, the projections by $\pi_1$ and $\pi_2$, and the unique
map from $A$ to $B \times C$ given two maps $t$ and $s$ from $A$ to $B$ and $A$ to $C$
respectively by $\langle t, s \rangle$.

(ii) The functor $U: B \to Gr(E)$, given on each object $\Gamma$ of $B$ by $U(\Gamma) = (\Gamma, I)$
and on each morphism $f$ by $U(f) = (f, \mathrm{Id})$, has a right adjoint $G: Gr(E) \to B$. The object $G(\Gamma, A)$ is
abbreviated $\Gamma.A$ in the sequel and the morphism $G(f, h)$ is written $f.h$.
Furthermore $(\mathrm{Fst}, \mathrm{Snd}): (\Gamma.A, I) \to (\Gamma, A)$ denotes the counit of this
adjunction. The natural isomorphism between $\mathrm{Hom}_{Gr(E)}((-, I), (\Gamma, A))$ and
$\mathrm{Hom}_{B}(-, \Gamma.A)$ is denoted by $[-, -]$.

(iii) For every object $\Gamma$ of $B$ and $A$ of $E(\Gamma)$, the functor $\mathrm{Fst}^{*}: E(\Gamma) \to E(\Gamma.A)$
has a right adjoint $\Pi_A: E(\Gamma.A) \to E(\Gamma)$. We will write in the sequel $\mathrm{Cur}_I$ for
the natural isomorphism between $\mathrm{Hom}_{E(\Gamma.A)}(\mathrm{Fst}^{*}(B), C)$ and
$\mathrm{Hom}_{E(\Gamma)}(B, \Pi_A(C))$, and $\mathrm{App}_I$ for its counit.

(iv) The Beck-Chevalley condition for the adjunctions $\mathrm{Fst}^{*} \dashv \Pi_A$ is satisfied in
the strict sense, i.e. the equation $f^{*}(\mathrm{Cur}_I(t)) = \mathrm{Cur}_I((f.\mathrm{Id})^{*}(t))$ holds for
every $f: \Delta \to \Gamma$, $A \in E(\Gamma)$, $B \in E(\Gamma.A)$.

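Next, we define the interpretation of the ILT-calculus, which is the minimal
ILT-theory corresponding to the notion of IL-indexed category.

Definition 3 Given any IL-indexed category $E: B^{op} \to \mathrm{Cat}$ we define a map $[\![\_]\!]$
from types to objects in $E(\top)$, from intuitionistic contexts $\Gamma$ to objects in $B$,
from linear contexts $\Delta$ to objects in $E(\top)$, from double contexts to objects of
suitable fibres, and from terms $\Gamma \mid \Delta \vdash M: A$ to morphisms $[\![M]\!]: [\![\Delta]\!] \to [\![A]\!]$ in
$E([\![\Gamma]\!])$, by induction over the structure:

(i) On intuitionistic and linear contexts respectively:

\[
[\![\ ]\!] = \top \qquad [\![\Gamma, x: A]\!] = [\![\Gamma]\!].[\![A]\!] \qquad\qquad
[\![\ ]\!] = I \qquad [\![\Delta, a: A]\!] = [\![\Delta]\!] \otimes [\![A]\!]
\]

where $I$ is the tensor-unit in the category $E(\top)$ and also $[\![\Delta]\!]$ and $[\![A]\!]$ are
objects of $E(\top)$.

On double contexts: $[\![\Gamma \mid \Delta]\!] = \mathrm{Fst}^{*}_{[\![\Gamma]\!]}([\![\Delta]\!])$, because $[\![\Delta]\!]$, being a linear
context, is an object of $E(\top)$.

(ii) On types:

\[
\begin{array}{ll}
[\![A \to B]\!] = \Pi_{[\![A]\!]}([\![B]\!]) & [\![A \multimap B]\!] = [\![A]\!] \multimap [\![B]\!] \\
[\![I]\!] = I & [\![A \mathbin{\&} B]\!] = [\![A]\!] \times [\![B]\!] \\
[\![A \otimes B]\!] = [\![A]\!] \otimes [\![B]\!] & [\![1]\!] = \top
\end{array}
\]

(iii) On terms (assuming that $\Gamma = x_1: A_1, \ldots, x_n: A_n$):

\[
[\![\Gamma, x: A \mid \_ \vdash x: A]\!] = \mathrm{Snd}
\qquad
\frac{[\![\Gamma \mid \Delta \vdash M: A]\!] = t}{[\![\Gamma, x: B \mid \Delta \vdash M: A]\!] = \mathrm{Fst}^{*}\, t}
\]

\[
[\![\Gamma \mid a: A \vdash a: A]\!] = \mathrm{Id}
\qquad
[\![\Gamma \mid \Delta \vdash \circ: 1]\!] = \mathrm{ter}_{[\![\Delta]\!]}
\]

\[
\frac{[\![\Gamma, x: A \mid \Delta \vdash M: B]\!] = t}{[\![\Gamma \mid \Delta \vdash \lambda x^{A}.M: A \to B]\!] = \mathrm{Cur}_I(t)}
\]

\[
\frac{[\![\Gamma \mid \Delta \vdash M: A \to B]\!] = t \qquad [\![\Gamma \mid \_ \vdash N: A]\!] = s}{[\![\Gamma \mid \Delta \vdash M\,{}_{i}N: B]\!] = (\mathrm{Id}, s)^{*}(\mathrm{App}_I \cdot t)}
\]

\[
\frac{[\![\Gamma \mid \Delta \vdash M: A]\!] = t \qquad [\![\Gamma \mid \Delta \vdash N: B]\!] = s}{[\![\Gamma \mid \Delta \vdash (M, N): A \mathbin{\&} B]\!] = \langle t, s \rangle}
\]

\[
\frac{[\![\Gamma \mid \Delta \vdash M: A \mathbin{\&} B]\!] = t}{[\![\Gamma \mid \Delta \vdash \mathsf{Fst}(M): A]\!] = \pi_1 \cdot t}
\qquad
\frac{[\![\Gamma \mid \Delta \vdash M: A \mathbin{\&} B]\!] = t}{[\![\Gamma \mid \Delta \vdash \mathsf{Snd}(M): B]\!] = \pi_2 \cdot t}
\]

\[
\frac{[\![\Gamma \mid \Delta_1 \vdash M: A]\!] = t \qquad [\![\Gamma \mid \Delta_2 \vdash N: B]\!] = s}{[\![\Gamma \mid \Delta \vdash M \otimes N: A \otimes B]\!] = (t \otimes s) \cdot \pi}
\]

\[
\frac{[\![\Gamma \mid \Delta_1 \vdash M: A \otimes B]\!] = m \qquad [\![\Gamma \mid \Delta_2, a: A, b: B \vdash N: C]\!] = n}{[\![\Gamma \mid \Delta \vdash \mathsf{let}\ M\ \mathsf{be}\ a \otimes b\ \mathsf{in}\ N: C]\!] = n \cdot (\mathrm{Id} \otimes m) \cdot \pi}
\]

\[
[\![\Gamma \mid \_ \vdash \bullet: I]\!] = \mathrm{Id}
\]

\[
\frac{[\![\Gamma \mid \Delta_1 \vdash M: I]\!] = m \qquad [\![\Gamma \mid \Delta_2 \vdash N: C]\!] = n}{[\![\Gamma \mid \Delta \vdash \mathsf{let}\ M\ \mathsf{be}\ \bullet\ \mathsf{in}\ N: C]\!] = n \cdot \psi \cdot (\mathrm{Id} \otimes m) \cdot \pi}
\]

where $\psi$ is one part of the isomorphism between $[\![\Delta_2]\!]$ and $[\![\Delta_2]\!] \otimes I$.

\[
\frac{[\![\Gamma \mid \Delta, a: A \vdash M: B]\!] = t}{[\![\Gamma \mid \Delta \vdash \lambda a^{A}.M: A \multimap B]\!] = \mathrm{Cur}_L(t)}
\]

where $\mathrm{Cur}_L$ is the corresponding natural transformation for the adjunction
between $(-) \otimes [\![A]\!]$ and $[\![A]\!] \multimap (-)$ in $E([\![\Gamma]\!])$.

\[
\frac{[\![\Gamma \mid \Delta_1 \vdash M: A \multimap B]\!] = t \qquad [\![\Gamma \mid \Delta_2 \vdash N: A]\!] = s}{[\![\Gamma \mid \Delta \vdash M\,{}_{l}N: B]\!] = \mathrm{App}_L \cdot (t \otimes s) \cdot \pi}
\]

where $\mathrm{App}_L$ is the counit of the right adjoint to tensor and $\pi$ is the canonical
morphism from $[\![\Delta]\!]$ to $[\![\Delta_1]\!] \otimes [\![\Delta_2]\!]$.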

Next, we turn to the soundness of this categorical semantics. As always, the 
key lemmata concern substitution. In particular they are needed to prove the 
validity of introduction and elimination rules regarding intuitionistic implication 
and of all the conversion rules involving substitution. As we have two kinds of 
substitution, we have to show two substitution lemmata, namely for substitution 
of intuitionistic and linear variables. 

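Lemma 4 (i) Assume $[\![\Gamma, x: A \mid \Delta \vdash M: B]\!] = t$ and $[\![\Gamma \mid \_ \vdash N: A]\!] = s$. Then
$[\![\Gamma \mid \Delta \vdash M[N/x]: B]\!] = (\mathrm{Id}, s)^{*}\, t$.

(ii) Assume $[\![\Gamma \mid \Delta_1, a: A \vdash M: B]\!] = t$ and $[\![\Gamma \mid \Delta_2 \vdash N: A]\!] = s$. Then
$[\![\Gamma \mid \Delta \vdash M[N/a]: B]\!] = t \cdot (\mathrm{Id} \otimes s) \cdot \pi$, where $\pi$ is the canonical morphism
from $[\![\Delta]\!]$ to $[\![\Delta_1]\!] \otimes [\![\Delta_2]\!]$.

Proof. Induction over the structure of $M$.

The soundness proof is now routine.

Theorem 5 Given an IL-indexed category $E: B^{op} \to \mathrm{Cat}$, under the above
interpretation $[\![\_]\!]$ the following facts hold.

(i) Assume $\Gamma \mid \Delta \vdash M: A$. Then $[\![\Gamma \mid \Delta \vdash M: A]\!]$ is a morphism from $[\![\Delta]\!]$ to
$[\![A]\!]$ in $E([\![\Gamma]\!])$;

(ii) Assume $\Gamma \mid \Delta \vdash M = N: A$. Then $[\![\Gamma \mid \Delta \vdash M: A]\!] = [\![\Gamma \mid \Delta \vdash N: A]\!]$.

Now we turn to the completeness theorem.

Theorem 6 Let $\Gamma \mid \Delta \vdash M: A$ and $\Gamma \mid \Delta \vdash N: A$ be derivable sequents. If
$[\![\Gamma \mid \Delta \vdash M: A]\!] = [\![\Gamma \mid \Delta \vdash N: A]\!]$, where $[\![\_]\!]$ is the interpretation defined above,
for every IL-indexed category $E: B^{op} \to \mathrm{Cat}$, then we can derive
$\Gamma \mid \Delta \vdash M = N: A$ in ILT.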

Proof. As usual the proof is based on the construction of a term model out of 
ILT. Since the interpretation of ILT in the syntactic model turns out to be the 
identity then the completeness immediately follows. 

First recall that in order to prove that two functors $U: B \to Gr(E)$ and
$G: Gr(E) \to B$ define an adjunction $U \dashv G$, we give two data: firstly,
a natural transformation $\alpha_D: \mathrm{Hom}(U(-), D) \to \mathrm{Hom}(-, G(D))$ for each
object $D$ in $Gr(E)$, and secondly the co-unit, that is a natural transformation
$\epsilon: U \cdot G \to 1$ such that for every object $C$ in $B$ and every $f$ in $\mathrm{Hom}(U(C), D)$ we
have $\epsilon_D \cdot U(\alpha_D(f)) = f$.

Now we proceed by defining the syntactic IL-indexed category starting from
an ILT-theory, based on the ILT-calculus and possibly some ground types with
the corresponding terms.

Definition 7 Given an ILT-theory $\mathcal{T}$ with any set of ground types $\mathcal{G}$ we define
the syntactic IL-indexed category $F(\mathcal{T})$ in the following way.

The base category:

– Objects of the base category $B(\mathcal{T})$ are lists of types $(A_1, \ldots, A_n)$. The
terminal object is the empty context $[\,]$.

– Morphisms from $(A_1, \ldots, A_n)$ to $(B_1, \ldots, B_m)$ are lists of terms
$(M_1, \ldots, M_m)$ such that $x_1: A_1, \ldots, x_n: A_n \mid \_ \vdash M_i: B_i$ for some intuitionistic
variables $x_1, \ldots, x_n$. We will write $\vec{A}$ for $(A_1, \ldots, A_n)$ whenever convenient.

– Two morphisms $(M_1, \ldots, M_m)$ and $(N_1, \ldots, N_m)$ from $(A_1, \ldots, A_n)$ to
$(B_1, \ldots, B_m)$, supposing that $x_1: A_1, \ldots, x_n: A_n \mid \_ \vdash M_i: B_i$ and
$y_1: A_1, \ldots, y_n: A_n \mid \_ \vdash N_i: B_i$, are equal if we can derive
$x_1: A_1, \ldots, x_n: A_n \mid \_ \vdash M_i = N_i[\vec{x}/\vec{y}]: B_i$. We will write $(\vec{M})$ for
$(M_1, \ldots, M_m)$ whenever convenient.

– The identity morphism on $(\vec{A})$ is the list $(\vec{x})$.

– Composition is given by intuitionistic substitution: given morphisms
$(M_1, \ldots, M_m)$ from $(A_1, \ldots, A_n)$ to $(B_1, \ldots, B_m)$ with
$x_1: A_1, \ldots, x_n: A_n \mid \_ \vdash M_i: B_i$ and $(N_1, \ldots, N_n)$ from $(C_1, \ldots, C_k)$ to $\vec{A}$
such that $y_1: C_1, \ldots, y_k: C_k \mid \_ \vdash N_j: A_j$, we define $\vec{M} \cdot \vec{N}$ to be
$(M_1[\vec{N}/\vec{x}], \ldots, M_m[\vec{N}/\vec{x}])$.



The fibres:

– The objects of the fibre $E(\vec{A})$ are types $A$.

– A morphism from $A$ to $B$ in $E(\vec{A})$ is a term $M$ such that
$x_1: A_1, \ldots, x_n: A_n \mid a: A \vdash M: B$. Two morphisms $M$ and $N$ from $A$ to $B$ in
$E(\vec{A})$ such that $\vec{x}: \vec{A} \mid a: A \vdash M: B$ and $\vec{y}: \vec{A} \mid b: A \vdash N: B$ are
equal if we can derive $\vec{x}: \vec{A} \mid a: A \vdash M = N[\vec{x}/\vec{y}, a/b]: B$.

– For any morphism $\vec{M}$ from $\vec{A}$ to $\vec{B}$, the functor $E(\vec{M})$ is the identity on
the objects and transforms any morphism $M$ with $\vec{y}: \vec{B} \mid a: A \vdash M: B$ to
$M[\vec{M}/\vec{y}]$.

The structure in a fibre is given in the following.

– The tensor product of two objects $A$ and $B$ in the fibre $E(\vec{A})$ is the type
$A \otimes B$. The tensor product of two morphisms $M$ and $N$ in $E(\vec{A})$ is the
term $\mathsf{let}\ z\ \mathsf{be}\ a \otimes b\ \mathsf{in}\ M \otimes N$ if $\vec{x}: \vec{A} \mid a: A \vdash M: B$ and
$\vec{y}: \vec{A} \mid b: A \vdash N: B$.

– The unit of the category $E(\vec{A})$ is given by the type $I$.

– The product and terminal object in $E(\vec{A})$ are given by the products and the
type $1$ in the syntax in the standard way.

– The right adjoint to the tensor product in $E(\vec{A})$ is given by the natural
transformation mapping the morphism $M$ from $C \otimes A$ to $B$ to $\lambda a^{A}.M[c \otimes a/b]$,
where $\vec{x}: \vec{A} \mid b: C \otimes A \vdash M: B$; the co-unit is the natural transformation whose
component at the object $B$ is given by the morphism $\mathsf{let}\ c\ \mathsf{be}\ a \otimes b\ \mathsf{in}\ a\,{}_{l}b$ with
$\vec{x}: \vec{A} \mid c: (A \multimap B) \otimes A \vdash \mathsf{let}\ c\ \mathsf{be}\ a \otimes b\ \mathsf{in}\ a\,{}_{l}b: B$.

The comprehension property:

– The right adjoint $G$ to the functor $U$ is given by

\[
G((A_1, \ldots, A_n), A) = (A_1, \ldots, A_n, A) \qquad G(\vec{M}, M) = (\vec{M}, M[x/a])
\]

if $\vec{x}: \vec{A} \mid a: A \vdash M: B$, since by Lemma 1 we can derive $\vec{x}: \vec{A}, x: A \mid \_ \vdash
M[x/a]: B$. For any morphism $((M_1, \ldots, M_n), M)$ with $\vec{x}: \vec{A} \mid a: I \vdash M: B$
the natural isomorphism $[\vec{M}, M]$ is $(M_1, \ldots, M_n, M[\bullet/a])$. The co-unit for
the object $((A_1, \ldots, A_n), A)$ is the morphism $((x_1, \ldots, x_n), \mathsf{let}\ a\ \mathsf{be}\ \bullet\ \mathsf{in}\ x)$.

The intuitionistic function space:

– The right adjoint to $\mathrm{Fst}^{*}: E(\vec{A}) \to E((\vec{A}, A))$ is the functor
$\Pi_A(-): E((\vec{A}, A)) \to E(\vec{A})$ which maps every object $C$ of $E((\vec{A}, A))$ to $A \to C$ and
every morphism $M$ in $E((\vec{A}, A))$ to $\lambda x^{A}.M$. The natural transformation $\mathrm{Cur}_I$
maps the morphism $M$ from $C$ to $B$ in $E((\vec{A}, A))$ to $\lambda x^{A}.M$ if
$\vec{x}: \vec{A}, x: A \mid c: C \vdash M: B$; the co-unit is the natural transformation whose component at the
object $B$ is given by the term $a\,{}_{i}x$ where $\vec{x}: \vec{A}, x: A \mid a: A \to B \vdash a\,{}_{i}x: B$.

Note the subtle difference in the definition of the base category and the fibre: 
we define objects in the base category to be lists of types, whereas objects in the 
fibre are singleton types. Having products in the calculus, we could have chosen 
a uniform definition and defined the objects of B to be types rather than lists of 
types. However, this means we would need to use projections in the syntax to 
access the components of the product, which is rather cumbersome. In contrast 
we have no choice for the definition of the fibres but to use types as objects. The 
reason is that with the other choice there is no way of turning the fibre into a 
symmetric monoidal closed category, as there is no way of defining $C \multimap A \otimes B$ in
terms of $C \multimap A$ and $C \multimap B$. This is not a problem for a cartesian closed category,
as in this case we have $C \to A \times B \cong (C \to A) \times (C \to B)$.
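The isomorphism used in the cartesian closed case can be derived in a few steps. The following is a standard argument, given here as a sketch: for an arbitrary object $D$ of a cartesian closed category,

```latex
\[
\begin{aligned}
\mathrm{Hom}(D,\; C \to (A \times B))
  &\cong \mathrm{Hom}(D \times C,\; A \times B) \\
  &\cong \mathrm{Hom}(D \times C,\; A) \times \mathrm{Hom}(D \times C,\; B) \\
  &\cong \mathrm{Hom}(D,\; C \to A) \times \mathrm{Hom}(D,\; C \to B) \\
  &\cong \mathrm{Hom}(D,\; (C \to A) \times (C \to B))
\end{aligned}
\]
```

All steps are natural in $D$, so the Yoneda lemma gives $C \to (A \times B) \cong (C \to A) \times (C \to B)$. The second step uses the universal property of the product $A \times B$; the tensor $A \otimes B$ has no such property, which is why the analogous decomposition of $C \multimap A \otimes B$ is not available.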

The key part of the completeness theorem is the following proposition, whose 
proof is a routine verification: 

Proposition 8 For any ILT-theory $\mathcal{T}$ the syntactic IL-indexed category $F(\mathcal{T})$
is an IL-indexed category.

The syntactic IL-indexed category allows us to prove completeness for ILT with
respect to IL-indexed categories.

4 ILT as an Internal Language 

Starting from the above soundness and completeness theorems we want to see 
if ILT is actually an internal language of IL-indexed categories. To this purpose 
we define the following categories TH(ILT) and IL-ind.

Definition 9 The objects of TH(ILT) are the ILT-theories, i.e. type theories
whose inference rules include the ILT ones. The morphisms are translations that
send types to types so as to preserve $I$, $1$, $\otimes$, $\&$, $\multimap$ and $\to$. They send terms to terms
so as to preserve the introduction and elimination constructors corresponding to
the above types, and they send intuitionistic (linear) variables to intuitionistic
(linear) variables respecting their typability, such that the typability judgement
and equality between terms are preserved.



Definition 10 The objects of the category IL-ind are IL-indexed categories and
the morphisms between $E: B^{op} \to \mathrm{Cat}$ and $E': B'^{op} \to \mathrm{Cat}$ are given by a functor
$H: B \to B'$ preserving the terminal object and a natural transformation
$\alpha: E \Rightarrow E' \cdot H$ such that for every object $\Delta$ in $B$, $\alpha_\Delta: E(\Delta) \to E'(H(\Delta))$ is
an SMCP functor. Finally the comprehension adjunction is preserved and the
intuitionistic function spaces too, as expressed by the following conditions
(where we distinguish the structure of $E'$ from that of $E$ with a
prime).

1. For every object $\Delta$ in $B$ and $A$ in $E(\Delta)$, and for every morphism
$(f, t): (\Delta, A) \to (\Gamma, C)$ in $Gr(E)$ we have $H(G(\Delta, A)) = G'(H(\Delta), \alpha_\Delta(A))$ and
$H(G(f, t)) = G'(H(f), \alpha_\Delta(t))$.

2. For every $(f, t): (\Delta, I) \to (\Gamma, C)$, $H([f, t]) = [(H(f), \alpha_\Delta(t))]'$.

3. For every object $\Delta$ in $B$, $A$ in $E(\Delta)$ and $C$ in $E(\Delta.A)$ and every
morphism $f$ in $E(\Delta.A)$ we have that $\alpha_\Delta(\Pi_A(C)) = \Pi'_{\alpha_\Delta(A)}(\alpha_{\Delta.A}(C))$ and
$\alpha_\Delta(\mathrm{Cur}_I(f)) = \mathrm{Cur}_I'(\alpha_{\Delta.A}(f))$.

Formally, the fact that ILT is the internal language of our IL-indexed
categories is proved by providing an equivalence between the category of ILT-theories
TH(ILT) and that of IL-indexed categories IL-ind. But we can prove the
above equivalence only if we put some restrictions on the IL-indexed categories.

Definition 11 An IL-indexed category $E: B^{op} \to \mathrm{Cat}$ is a restricted IL-indexed
category if the following conditions hold:

1. for every object $\Delta \in Ob(B)$, writing $!_\Delta: \Delta \to \top$ for the unique map
towards the terminal object $\top$ in $B$, the functor $E(!_\Delta): E(\top) \to E(\Delta)$ is
bijective on the objects;

2. the right adjoint $G$ restricted to the fibre over $\top$, i.e. to $E(\top)$, is
bijective on the objects.

Finally we call rIL-ind the full subcategory of IL-ind whose objects are restricted
IL-indexed categories.

Now we are ready to prove the following: 

Proposition 12 There exist two functors $F: \mathrm{TH}(ILT) \to \text{rIL-ind}$ and
$L: \text{rIL-ind} \to \mathrm{TH}(ILT)$ that give rise to an equivalence between TH(ILT) and
rIL-ind.

Proof. Given an ILT-theory $\mathcal{T}$ we define $F(\mathcal{T})$ analogously to the
definition of the syntactic IL-indexed category in Definition 7, but we take $B(\mathcal{T})$ to be the
category whose objects are the ILT-types and whose morphisms between the
types $A$ and $C$ are terms $x: A \mid \_ \vdash c: C$. Hence we define $G((A, B)) = A \mathbin{\&} B$. We can
easily see that $F(\mathcal{T})$ is a restricted IL-indexed category.

We can obviously lift any translation to become a morphism between IL-indexed
categories. Given an IL-indexed category $E: B^{op} \to \mathrm{Cat}$ we now define an
ILT-theory $L(E)$ out of it in the following way:

Definition 13 The language of $L(E)$ is defined as follows:

1. the types of $L(E)$ are the objects of the fibre $E(\top)$;

2. the preterms of $L(E)$ are the morphisms of $E(\Delta)$ for every object $\Delta$ of $B$;

3. the inference rules are defined as the interpretation function in
Definition 3. Note that two typed terms represented by two morphisms in the same
fibre are equal if they are equal as morphisms.

The functor $L$ can be easily extended to the morphisms of IL-indexed categories
to define translations. What remains to be checked is that the two compositions
$L \cdot F$ and $F \cdot L$ are naturally isomorphic to the corresponding identity functors.
For every ILT-theory $\mathcal{T}$ it is easy to check that $L(F(\mathcal{T}))$ can be translated into
$\mathcal{T}$ via a natural isomorphism.

For every restricted IL-indexed category $E: B^{op} \to \mathrm{Cat}$ we prove that
$F(L(E))$ is equivalent to $E$ by the added requirements on IL-indexed categories.
The base category $B$ is equivalent to $B(F(L(E)))$, since by the comprehension
adjunction with respect to $E$ together with the first requirement we can build
a full, faithful and surjective functor from $B(F(L(E)))$ to $B$. The natural
transformation on each fibre is given by the projection functors on the objects
and by the identity on morphisms. The components of this natural
transformation are isomorphisms by the second requirement on the restricted
indexed category.

Note that the internal language could be naturally enriched with explicit
substitutions on terms, to represent the composition in the fibre by explicit
substitutions of linear variables and the morphism assignment of $E$ by explicit
substitutions of intuitionistic variables. But if we want to interpret explicit substitution
operations on contexts in a different way from those on terms, then we need to
add another fibration to each SMCP fibre in the style of [GdPR99], passing to
a complicated doubly indexed category.

Moreover, observe that every categorical model defined by Benton [Ben95],
given by a symmetric monoidal adjunction $F \dashv K$ with $F: C \to S$, $C$ a cartesian
closed category and $S$ a symmetric monoidal closed category with finite products,
provides an IL-indexed category, by taking as the base the cartesian closed category
$C$ and as the fibre over an object $C$ of the base the symmetric monoidal closed
category with finite products whose objects are those of $S$ but whose morphisms
from $A$ to $B$ are the $S$-morphisms $F(C) \otimes A \to B$. The intuitionistic function space between
$A$ and $B$ is given by the usual $F(A) \multimap B$.
Then, if the adjunction $F \dashv K$ satisfies the requirement that $C$ is the co-Kleisli
category with respect to the comonad induced by the adjunction and $K$ the
embedding functor via the counit, the above IL-indexed category is also restricted.
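To see why this construction has the required comprehension adjunction, the hom-sets can be written out. The display below is only a sketch under stated assumptions: the base category is written $\mathcal{C}$ to keep it apart from its objects $C$ and $D$, reindexing is taken to be the identity on objects, and the formula $G(C, A) = C \times K(A)$ for the comprehension functor is an assumption introduced here, not a statement from the text. The check uses only the unit isomorphism $F(D) \otimes I \cong F(D)$, the adjunction $F \dashv K$, and products in $\mathcal{C}$.

```latex
\[
\begin{aligned}
\mathrm{Hom}_{E(C)}(A, B) &:= \mathrm{Hom}_{S}\bigl(F(C) \otimes A,\; B\bigr) \\[4pt]
\mathrm{Hom}_{Gr(E)}\bigl((D, I),\, (C, A)\bigr)
  &= \bigl\{(f, h) \mid f: D \to C,\ h \in \mathrm{Hom}_{S}(F(D) \otimes I,\; A)\bigr\} \\
  &\cong \mathrm{Hom}_{\mathcal{C}}(D, C) \times \mathrm{Hom}_{S}(F(D),\; A) \\
  &\cong \mathrm{Hom}_{\mathcal{C}}(D, C) \times \mathrm{Hom}_{\mathcal{C}}(D,\; K(A)) \\
  &\cong \mathrm{Hom}_{\mathcal{C}}\bigl(D,\; C \times K(A)\bigr)
\end{aligned}
\]
```

Under these assumptions a morphism of the base into $C \times K(A)$ is exactly a base morphism into $C$ together with a fibre morphism out of the unit $I$, which is the shape required of the adjunction $U \dashv G$.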

5 The Connection to the Exponentials 

In this section we show how to regain exponentials. We characterise exponentials
by a universal construction, namely as the left adjoint to the functor which
replaces all linear variables in a term by intuitionistic ones.

Note that since ILT is the internal language of a restricted IL-indexed
category, its base category is actually cartesian closed, so we can give the following
definition to get exponentials:

Definition 14 An rIL-indexed category with exponentials is a restricted
IL-indexed category $E: B^{op} \to \mathrm{Cat}$ such that the functor $\mathcal{I}: E(\top) \to B$ given by
$\mathcal{I}(A) = \top.A$, $\mathcal{I}(t) = \mathrm{Id}.t$ has a symmetric monoidal left adjoint, forming a
symmetric monoidal adjunction. We write $!$ for the left adjoint.

Note that $\mathcal{I}$ is a monoidal functor, as can be shown using the internal language. It is
also possible to define the exponentials by the condition
$\mathrm{Hom}_{E(\Gamma)}({!^{*}A}, B) \cong \mathrm{Hom}_{E(\Gamma.A)}(I, B)$ plus a Beck-Chevalley condition [HS99]. It is easy to see that
these two definitions are equivalent: if you specialise the second condition to the
case $\Gamma = \top$ and use the adjunction between $B$ and $Gr(E)$, putting $!(\top.A) = {!^{*}(A)}$,
you obtain the first condition by the first requirement of restricted indexed
categories. The converse argument goes as follows:

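\[
\begin{aligned}
\mathrm{Hom}_{E(\Gamma)}({!^{*}A}, B)
  &\cong \mathrm{Hom}_{E(\Gamma)}(I,\; {!^{*}A} \multimap B)
   \cong \mathrm{Hom}_{B}(\Gamma,\; \top.({!^{*}A} \multimap B)) \\
  &\cong \mathrm{Hom}_{E(\top)}({!\Gamma},\; {!^{*}A} \multimap B)
   \cong \mathrm{Hom}_{E(\top)}({!\Gamma} \otimes {!^{*}A},\; B)
   \cong \mathrm{Hom}_{E(\top)}({!(\Gamma \times (\top.A))},\; B) \\
  &\cong \mathrm{Hom}_{B}(\Gamma.A,\; \top.B)
   \cong \mathrm{Hom}_{E(\Gamma.A)}(I, B)
\end{aligned}
\]

where the second-but-last equivalence uses the fact that the adjunction between
$E(\top)$ and $B$ is monoidal.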

It is instructive to examine the relation between an rIL-indexed category and
certain of Benton's linear-nonlinear categories, as expressed in the following.

Definition 15 The category $\mathrm{Ben}_r$ has as objects Benton's models $F \dashv K$
[Ben95], i.e. a symmetric monoidal adjunction between a cartesian closed
category $C$ and a symmetric monoidal closed category with finite products $S$ where
$F: C \to S$, such that $C$ is the co-Kleisli category with respect to the comonad
induced by the adjunction and $K$ is the embedding functor via the counit. The
morphisms between $F \dashv K$ and $F' \dashv K'$, with $F: C \to S$ and $F': C' \to S'$,
are functors $M: S \to S'$ preserving the SMCP structure and the symmetric
monoidal comonad.

We recall that the category Ben of Benton's models and pairs of functors
$(H_1, H_2)$ with $H_1: S \to S'$, $H_2: C \to C'$ commuting with the adjoints and
preserving the monoidal adjunction has $\mathrm{Ben}_r$ as a co-reflective subcategory:

[Diagram: the inclusion of $\mathrm{Ben}_r$ into Ben together with its co-reflection]

assigning to each Benton's model the corresponding one given by the monoidal
adjunction with the co-Kleisli category. Moreover note that the category $\mathrm{Ben}_r$ is
equivalent to the category Bier of Bierman's models (see [Bie94]) with functors
preserving the relevant structure.

Definition 16 The category rIL-indE has restricted IL-indexed categories as
objects and as morphisms IL-morphisms preserving the adjunction that defines
exponentials, i.e. a morphism between $E: B^{op} \to \mathrm{Cat}$ and $E': B'^{op} \to \mathrm{Cat}$ is an
rIL-ind morphism given by $H: B \to B'$ and $\alpha: E \Rightarrow E' \cdot H$ such that it also
satisfies the following conditions:

– for every object $\Delta$ in $B$, $!'H(\Delta) = \alpha_\top({!\Delta})$;

– for every object $\Delta$ in $B$ and $A$ in $E(\top)$ and for every morphism
$t: \Delta \to \mathcal{I}(A)$ in $B$, $\phi'(H(t)) = \alpha_\top(\phi(t))$, where $\phi'$ and $\phi$ are the bijections of the
corresponding adjunctions.



Proposition 17 The category $\mathrm{Ben}_r$ of suitable Benton's models is equivalent
to the category rIL-indE.



Proof. We already saw how every linear-nonlinear category in $\mathrm{Ben}_r$ gives rise to
an rIL-indexed category in Section 4. The exponentials in Benton's setting satisfy
the universal property for exponentials in an rIL-indexed category with
exponentials. Conversely, any rIL-indexed category with exponentials $E: B^{op} \to \mathrm{Cat}$ gives
rise to a linear-nonlinear category: the symmetric monoidal closed category is
$E(\top)$, and the cartesian closed category is the base category $B$, which we prove
to be cartesian closed by means of its internal language ILT. Now we can
observe by the internal language that the adjunction $! \dashv \mathcal{I}$ gives rise to a symmetric
monoidal adjunction between $E(\top)$ and $B$. Using the equivalence between the
two definitions of exponentials given above one shows that these functors define
an equivalence.

By the above proposition we conclude that we can embed the category Ben 
into the category IL-ind through the reflection into Ben^ as an alternative to 
the embedding into IL-ind obtained by taking the cartesian closed category of 
a Benton’s model as the base category of the indexed category. 



[Diagram relating the categories Bier, $\mathrm{Ben}_r$, Ben, rIL-indE, rIL-ind and IL-ind]



Here we could also prove that rIL-indE is a reflective subcategory of rIL-ind,
whose reflection is given by freely adding the $!$ modality to the internal language
of an rIL-indexed category and then considering the associated syntactic category.
In the context of Benton's model, once the SMCP category $S$ is fixed one is
free to choose a functor $F: C \to S$ to represent exponentials. In the context of
rIL-indexed categories with exponentials the choice of $F$ is determined by the
choice of the indexed category $E$, that is the substitution along intuitionistic
variables.

6 Conclusion 

We have produced a sound and complete model for the type theory ILT.
Moreover we showed that, with a suitable restriction, ILT is the internal
language of IL-indexed categories. The reasons for developing ILT are of a
pragmatic nature: in applications within linear functional programming, it seems a
good idea to have both intuitionistic and linear implication co-existing, instead
of having intuitionistic implication as a derived operation, obtained from Girard's
translation.

We hope to find a good representation in terms of one-dimensional categories
for IL-indexed categories. Maybe in order to achieve this we need to extend ILT
with a connective reflecting the logical role of the operation "|" acting on
ILT-contexts.

Abstract. We give a divergence-free encoding of polyadic Local $\pi$ into
its monadic variant. Local $\pi$ is a sub-calculus of asynchronous $\pi$-calculus
where the recipients of a channel are local to the process that has created
the channel. We prove the encoding fully-abstract with respect to barbed
congruence. This implies that in Local $\pi$ (i) polyadicity does not add
extra expressive power, and (ii) when studying the theory of polyadic
Local $\pi$ we can focus on the simpler monadic variant. Then, we show
how the idea of our encoding can be adapted to name-passing calculi
with non-binding input prefix, such as Chi, Fusion and $\pi$F calculi.



1 Introduction 

Local $\pi$, in short L$\pi$, is a variant of the asynchronous $\pi$-calculus […] where
the recipients of a channel are local to the process that has created the channel.
More precisely, in a process $(\nu a)\,P$ all possible inputs at $a$ appear, and are
syntactically visible, in $P$; no further inputs may be created, inside or outside
$P$. The locality property of channels is achieved by imposing that only the
output capability of names may be transmitted, i.e., the recipient of a name may
only use it in output actions. L$\pi$ is a very expressive fragment of asynchronous
$\pi$-calculus, and its theory has been studied in [15]; similar calculi are discussed,
or at least mentioned, in […, …, 30]. L$\pi$ borrows ideas from some experimental
programming languages (or proposals of programming languages), most notably
Pict […], Join […], and Blue […], and can be regarded as a basis for them (the
restriction on output capabilities is not explicit in Pict, but, as we understand
from the Pict users, most Pict programs obey it). The locality property makes
L$\pi$ particularly suitable for giving the semantics to, and reasoning about,
concurrent or distributed object-oriented languages […]. For instance, the locality
property can guarantee unique identity of objects, a fundamental feature of
objects.

As for most name-passing calculi, the theoretical developments on L$\pi$ have
been conducted on a monadic calculus, that is, a calculus in which only single
names can be transmitted. On the other hand, most applications in name-passing
calculi use polyadic communications, i.e., communications involving tuples of
names. So, an interesting issue is to investigate whether monadic and polyadic
name-passing calculi have the same expressive power. In this paper we show that,
under the locality hypothesis on channels, monadic and polyadic $\pi$-calculi have
the same expressive power. More precisely, we give an encoding of polyadic
L$\pi$ into monadic L$\pi$, and we prove it fully-abstract with respect to barbed
congruence [18]. Our encoding is divergence-free, that is, it does not introduce infinite
internal computations. Furthermore, we show how the idea of our encoding can
be easily adapted to name-passing calculi with non-binding input prefix, such
as Chi calculus […], Fusion calculus [19] and $\pi$F-calculus […], and we propose a
simple encoding of polyadicity for these calculi.

The first attempt at encoding polyadicity in name-passing calculi is due to
Robin Milner [16]. Milner gives a simple encoding of polyadic into monadic
synchronous π-calculus. Milner’s encoding is not fully-abstract. In order to
recover full abstraction, Yoshida [29] and Quaglia and Walker [21] have
introduced two different type systems for monadic processes which model the
communication protocol underlying Milner’s encoding. A different approach has
been followed by Gonthier and Fournet in the Join-calculus [H], an “extended
subset” of the asynchronous π-calculus. In [8], among other results, a direct,
although complex, fully-abstract encoding of polyadic processes into monadic
ones is proposed. All these approaches will be discussed at the end of the paper.

In this extended abstract proofs are just sketched; complete proofs can be
found in [13].

Outline The paper is structured as follows. In Section 2 we describe the
polyadic Lπ calculus giving some properties of it; in Section 3 we recall a few
correctness criteria for encodings; in Section 4 we present the encoding of
polyadic Lπ into monadic Lπ; in Section 5 we prove the full abstraction of the
encoding; in Section 6 we investigate other possible encodings of polyadicity in
Lπ; in Section 7 we show how the idea of our encoding can be adapted in
name-passing calculi with non-binding input prefix; in Section 8 we conclude and
discuss related works.

2 The Polyadic Lπ

Polyadic Lπ, in short Lπ, is an asynchronous fragment of Milner’s polyadic
π-calculus [16]. We use small letters $a, b, c, \ldots, x, y$ for names; capital
letters $P, Q, R$ for processes; and $\tilde{a}$ to denote a tuple of names
$a_1, \ldots, a_n$. Lπ has operators of inaction, input prefix, asynchronous
output, parallel composition, restriction and replicated input:

$$P ::= 0 \;\mid\; a(\tilde{x}).P \;\mid\; \overline{a}\langle\tilde{b}\rangle \;\mid\; P \mid P \;\mid\; (\nu a)P \;\mid\; !a(\tilde{x}).P$$

where in input processes $a(\tilde{x}).P$ names in $\tilde{x}$ are all distinct
and may not occur free in $P$ in input position. This syntactic constraint
ensures that only the output capability of names may be transmitted.

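We use $\sigma$ for substitutions; $P\sigma$ is the result of applying $\sigma$
to $P$, with the usual renaming convention to avoid captures;
$\{\tilde{b}/\tilde{a}\}$ is the simultaneous substitution of names $\tilde{a}$
with names $\tilde{b}$. Parallel composition has the lowest precedence among the
operators, and $\prod_{i=1}^{n} P_i$ is an abbreviation for the process
$P_1 \mid \ldots \mid P_n$.

We write $(\nu\tilde{a})P$ for $(\nu a_1)\ldots(\nu a_n)P$ and $\overline{a}b$
for $\overline{a}\langle b\rangle$. The labeled transition system is the usual
one (in the late style liz]). Structural congruence, written $\equiv$ and defined
as usual (see [16]), allows us to ignore certain structural differences between
processes. Transitions are of the form $P \xrightarrow{\mu} P'$, where action
$\mu$ can be: $\tau$ (interaction); $a(\tilde{b})$ (input);
$(\nu\tilde{c})\,\overline{a}\langle\tilde{b}\rangle$ (output) where
$\tilde{c} \subseteq \tilde{b}$ and $\overline{a}(\tilde{b})$ is an abbreviation
for $(\nu\tilde{b})\,\overline{a}\langle\tilde{b}\rangle$. In these actions, $a$
is the subject and $\tilde{b}$ the object. We write
$P \xrightarrow{\hat{\mu}} Q$ to mean $P \xrightarrow{\mu} Q$, if
$\mu \neq \tau$, and either $P = Q$ or $P \xrightarrow{\tau} Q$, if
$\mu = \tau$. Relation $\Longrightarrow$ is the reflexive and transitive closure
of $\xrightarrow{\tau}$; moreover, $\stackrel{\mu}{\Longrightarrow}$ stands for
$\Longrightarrow\xrightarrow{\mu}\Longrightarrow$, and
$\stackrel{\hat{\mu}}{\Longrightarrow}$ for $\stackrel{\mu}{\Longrightarrow}$ if
$\mu \neq \tau$, and for $\Longrightarrow$ if $\mu = \tau$. Free and bound names
(fn, bn) of actions and processes are defined as usual.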

We assume Milner’s sorting system, under which all processes are well-sorted
[16]. Names are partitioned into a collection of sorts. A sorting function is
defined which maps sorts onto sequences of sorts. If a sort $S$ is mapped onto a
sequence of sorts $T$ this means that channels in $S$ can only carry tuples in
$T$. A sorting system is necessary to prevent arity mismatching in
communications, like in $\overline{a}\langle b, c\rangle \mid a(x).P$.
Substitutions must map names onto names of the same sort.
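
As a rough illustration of how a sorting rules out arity mismatches, the following Python sketch checks an output and an input against a sorting function. The dictionary representation and the helper names `output_ok` and `input_ok` are assumptions made for this example; they are not part of the calculus or of Milner's presentation.

```python
# Minimal sketch of a Milner-style sorting check (illustrative assumptions only).
# sort_of : name -> sort;  sorting : sort -> tuple of sorts carried by that sort.

def output_ok(sort_of, sorting, subject, objects):
    """An output on `subject` is well-sorted if the sorts of the transmitted
    names coincide with the tuple of sorts assigned to the subject's sort."""
    expected = sorting[sort_of[subject]]
    return tuple(sort_of[b] for b in objects) == expected


def input_ok(sort_of, sorting, subject, n_params):
    """An input on `subject` must bind exactly as many names as its sort carries."""
    return n_params == len(sorting[sort_of[subject]])


# Hypothetical sorting: channels of sort "S" carry pairs of names of sort "T".
sort_of = {"a": "S", "b": "T", "c": "T"}
sorting = {"S": ("T", "T"), "T": ()}

pair_output = output_ok(sort_of, sorting, "a", ("b", "c"))  # output of (b, c) on a
unary_input = input_ok(sort_of, sorting, "a", 1)            # input a(x).P
```

Under this hypothetical sorting the pair output is accepted while the unary input on the same channel is rejected, so the mismatched composition mentioned in the text cannot be well-sorted.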

The behavioral equivalence we are interested in is barbed congruence [18]. It is
well-known that barbed congruence represents a uniform mechanism for defining
a behavioral equivalence in any process calculus possessing (i) an interaction
relation (the $\tau$-steps in π-calculus), modeling the evolution of the system,
and (ii) an observability predicate $\downarrow_a$ for each name $a$ which
indicates the possibility for a process of accepting a communication at $a$ with
the environment. $P \downarrow_a$ holds if there is a derivative $P'$, and an
action $\mu$, with subject $a$, such that $P \xrightarrow{\mu} P'$. We also write
$P \Downarrow_a$ if there is a derivative $P'$ such that
$P \Longrightarrow P' \downarrow_a$. We recall that a context $C[\cdot]$ is a
process with exactly one hole, written $[\cdot]$, where a process may be plugged
in.

Definition 1 (barbed bisimilarity, congruence). Barbed bisimilarity, written
$\dot{\approx}$, is the largest symmetric relation on π-calculus processes such
that $P \dot{\approx} Q$ implies:

1. If $P \xrightarrow{\tau} P'$ then there exists $Q'$ such that
$Q \Longrightarrow Q'$ and $P' \dot{\approx} Q'$.

2. If $P \downarrow_a$ then $Q \Downarrow_a$.

Let $\mathcal{L}$ be a set of processes in $\pi_a$, and $P, Q \in \mathcal{L}$.
Two processes $P$ and $Q$ are barbed congruent in $\mathcal{L}$, written
$P \cong_{\mathcal{L}} Q$, if for each context $C[\cdot]$ in $\mathcal{L}$ it
holds that $C[P] \dot{\approx} C[Q]$.

The main inconvenience of barbed congruence is that it uses quantification over 
contexts in the definition, and this can make proofs of process equalities heavy. 
Simpler proof techniques are based on labeled characterizations without context 
quantification. 

Definition 2 (ground bisimilarity). Ground bisimilarity, written $\approx$, is
the largest symmetric relation on processes such that if $P \approx Q$,
$P \xrightarrow{\mu} P'$, $\mathrm{bn}(\mu) \cap \mathrm{fn}(Q) = \emptyset$,
then there exists $Q'$ such that $Q \stackrel{\hat{\mu}}{\Longrightarrow} Q'$ and
$P' \approx Q'$.

We recall that in the asynchronous calculi without matching, like Lπ, ground
bisimilarity coincides with early, late, and open bisimilarities [28]. All these
relations are congruences and imply barbed congruence.
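
Definition 2 can be explored on small examples with a naive greatest-fixed-point computation. The sketch below assumes a finite, name-free labeled transition system given as a Python dictionary, so the side condition on bound names is ignored and the check reduces to ordinary weak bisimilarity; `TAU`, `tau_closure`, `weak_successors` and `weak_bisimilar` are names chosen for this illustration only.

```python
from itertools import product

TAU = "tau"  # label used for silent moves in this sketch


def tau_closure(lts, s):
    """States reachable from s by zero or more tau-steps."""
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        for label, v in lts.get(u, []):
            if label == TAU and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def weak_successors(lts, s, label):
    """Targets of the weak move for `label`: tau* for TAU, tau* label tau* otherwise."""
    before = tau_closure(lts, s)
    if label == TAU:
        return before
    after = set()
    for u in before:
        for l, v in lts.get(u, []):
            if l == label:
                after |= tau_closure(lts, v)
    return after


def weak_bisimilar(lts, p, q):
    """Start from all pairs and discard pairs violating the bisimulation clause."""
    states = set(lts) | {v for moves in lts.values() for _, v in moves}
    rel = set(product(states, states))
    changed = True
    while changed:
        changed = False
        for s, t in list(rel):
            forth = all(any((s2, t2) in rel for t2 in weak_successors(lts, t, l))
                        for l, s2 in lts.get(s, []))
            back = all(any((s2, t2) in rel for s2 in weak_successors(lts, s, l))
                       for l, t2 in lts.get(t, []))
            if not (forth and back):
                rel.discard((s, t))
                changed = True
    return (p, q) in rel


# Toy system: r0 performs two inputs in sequence, s0 performs them in parallel.
lts = {
    "r0": [("a", "r1")], "r1": [("a", "r2")],
    "s0": [("a", "s1"), ("a", "s1b")], "s1": [("a", "s2")], "s1b": [("a", "s2")],
    "t0": [(TAU, "t1")], "t1": [("a", "t2")],
    "u0": [("a", "u1")],
}
```

The procedure is quadratic in the number of states per iteration and is meant only for hand-sized examples; efficient algorithms use partition refinement instead.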

In the technical part of this paper we shall need a means to count the number
of silent moves performed by a process in order to use up-to techniques [27, 24].
The expansion relation |^, written $\lesssim$, is an asymmetric variant of
$\approx$ such that $P \lesssim Q$ holds if $P \approx Q$, and $Q$ has at least
as many $\tau$-moves as $P$.

Definition 3 (expansion). $\lesssim$ is the largest relation on processes such
that $P \lesssim Q$ implies:

1. whenever $P \xrightarrow{\mu} P'$, and
$\mathrm{bn}(\mu) \cap \mathrm{fn}(Q) = \emptyset$, there exists $Q'$ such that
$Q \stackrel{\mu}{\Longrightarrow} Q'$ and $P' \lesssim Q'$;

2. whenever $Q \xrightarrow{\mu} Q'$, and
$\mathrm{bn}(\mu) \cap \mathrm{fn}(P) = \emptyset$, there exists $P'$ such that
$P \xrightarrow{\hat{\mu}} P'$ and $P' \lesssim Q'$.

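In both monadic and polyadic Lπ, barbed congruence is a relation strictly
larger than ground bisimilarity. For instance, in Lπ, if $P = \overline{a}b$ and
$Q = (\nu c)(\overline{a}c \mid\, !c(x).\overline{b}x)$ then
$P \cong_{L\pi} Q$ (see [15]) but $P \not\approx Q$. In [15], Merro and Sangiorgi
give two labeled characterizations of barbed congruence for monadic Lπ. One of
them is based on an encoding of Lπ into πI, a calculus where all names emitted
are private [25]. The (polyadic version of the) encoding (essentially Boreale’s
[4]) is a homomorphism on all operators except output, for which we have:

$$[\![\overline{a}\langle\tilde{b}\rangle]\!] \stackrel{\mathrm{def}}{=} (\nu\tilde{c})\,(\overline{a}\langle\tilde{c}\rangle \mid \tilde{c} \to \tilde{b})$$

where $\tilde{b} = (b_1, \ldots, b_n)$, $\tilde{c} = (c_1, \ldots, c_n)$; names
$b_i$ and $c_i$ have the same sort for all $i$;
$\tilde{c} \cap (\{a\} \cup \tilde{b}) = \emptyset$;
$\tilde{c} \to \tilde{b} \stackrel{\mathrm{def}}{=} \prod_{i=1}^{n} !c_i(\tilde{x}).[\![\overline{b_i}\langle\tilde{x}\rangle]\!]$
with $\tilde{x} = (x_1, \ldots, x_{m_i})$.

Remark 1. Being recursively defined, the process $\tilde{c} \to \tilde{b}$ is
not in Lπ, but it is ground bisimilar to a process of Lπ (using replication
instead of recursion).

Given two tuples of names $\tilde{b} = (b_1, \ldots, b_n)$ and
$\tilde{c} = (c_1, \ldots, c_n)$ where names $b_i$ and $c_i$ have the same sort
for all $i$, we denote with $\tilde{c} \triangleright \tilde{b}$ the process
$\prod_{i=1}^{n} !c_i(\tilde{x}).\overline{b_i}\langle\tilde{x}\rangle$. Note that
$[\![\tilde{c} \triangleright \tilde{b}]\!] = \tilde{c} \to \tilde{b}$.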

Below, we report a simple adaptation to the polyadic case of a few results on
$[\![\cdot]\!]$ that have already appeared in the literature: Theorem 1 provides
an adequacy result w.r.t. barbed bisimilarity; Theorem 2 gives a
characterization of barbed congruence in Lπ for image-finite processes. We
recall that the class of image-finite processes (to which most of the processes
one would like to write belong) is the largest subset $\mathcal{I}$ of
π-calculus processes which is derivation closed and such that
$P \in \mathcal{I}$ implies that, for all $\mu$, the set
$\{P' : P \stackrel{\mu}{\Longrightarrow} P'\}$, quotiented by alpha conversion,
is finite.

Theorem 1 (Boreale [4]). Let $P$ and $Q$ be two Lπ-processes then

Theorem 2 (Merro and Sangiorgi [15]). Let $P$ and $Q$ be two Lπ-processes.
Then

1. $P \cong_{L\pi} Q$ implies $[\![P]\!] \approx [\![Q]\!]$, for $P$ and $Q$
image-finite processes;

2. $[\![P]\!] \approx [\![Q]\!]$ implies $P \cong_{L\pi} Q$.

Remark 2. Theorem 2 has been proved in [15] with respect to asynchronous
barbed congruence (where only output barbs are taken into account) and an
asynchronous variant of $\approx$. The adaptation to the synchronous case is
straightforward.

3 Correctness Criteria for Encodings 

When studying an encoding between two languages it is necessary to have some
correctness criteria in order to assess the encoding. The most common
correctness criteria for an encoding between two process calculi are based on the
notions of operational correspondence and full abstraction. The former relates
the execution steps as defined by an operational semantics of the source and
target calculi. The latter relates the source and the target calculi at the level
of behavioral equivalences. More formally, let us denote with
$(\mathcal{S}, \asymp_s, \to_s)$ and $(\mathcal{T}, \asymp_t, \to_t)$ two process
calculi equipped with behavioral equivalences $\asymp_s$ and $\asymp_t$, and
transition relations $\to_s$ and $\to_t$, respectively. Let
$[\![\cdot]\!] : \mathcal{S} \mapsto \mathcal{T}$ be an encoding from
$\mathcal{S}$ to $\mathcal{T}$. A formal definition of operational correspondence
is the following:

Definition 4 (operational correspondence). Given two process calculi
$(\mathcal{S}, \asymp_s, \to_s)$ and $(\mathcal{T}, \asymp_t, \to_t)$, an
encoding $[\![\cdot]\!] : \mathcal{S} \mapsto \mathcal{T}$ enjoys a (strong)
operational correspondence if for each $S \in \mathcal{S}$ the following two
properties hold:

1. If $S \to_s S'$ then $[\![S]\!] \to_t \asymp_t [\![S']\!]$.

2. If $[\![S]\!] \to_t T$ then there is $S'$ such that $S \to_s S'$ and
$T \asymp_t [\![S']\!]$.

Requirements 1 and 2 assert that all possible executions of $S$ may be
simulated, up to behavioral equivalence, by its translation, and vice-versa. A
notion of weak operational correspondence can be easily derived from
Definition 4 by simply replacing $\to_s$ and $\to_t$ with their reflexive
transitive closure, in requirements 1 and 2.

Full abstraction has two parts: soundness, which says that the equivalence 
between the translations of two source terms implies that of the source terms 
themselves; and completeness, which says the converse. While soundness is a 
necessary property and can be usually derived from the operational correspon- 
dence, completeness is in general hard to achieve because it implies a strong 
relationship between source and target calculi. 

Definition 5 (soundness, completeness, and full abstraction).
Let $(\mathcal{S}, \asymp_s, \to_s)$ and $(\mathcal{T}, \asymp_t, \to_t)$ be two
process calculi. An encoding $[\![\cdot]\!] : \mathcal{S} \mapsto \mathcal{T}$ is
sound if $[\![S_1]\!] \asymp_t [\![S_2]\!]$ implies $S_1 \asymp_s S_2$ for each
$S_1, S_2 \in \mathcal{S}$; it is complete if $S_1 \asymp_s S_2$ implies
$[\![S_1]\!] \asymp_t [\![S_2]\!]$ for each $S_1, S_2 \in \mathcal{S}$; it is
fully-abstract if it is sound and complete.

Full abstraction will represent our correctness criterion for the encoding of
polyadicity that we are going to present in the next section.
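
The three notions of Definition 5 can be made concrete on a finite toy example. The sketch below is only an illustration under strong simplifying assumptions: the "calculi" are finite sets of integers, the equivalences are plain Python predicates, and the function names (`is_sound`, `is_complete`, `is_fully_abstract`) and the three sample encodings are hypothetical.

```python
def is_sound(terms, encode, eq_src, eq_tgt):
    """Equivalent translations imply equivalent source terms."""
    return all(eq_src(s1, s2)
               for s1 in terms for s2 in terms
               if eq_tgt(encode(s1), encode(s2)))


def is_complete(terms, encode, eq_src, eq_tgt):
    """Equivalent source terms imply equivalent translations."""
    return all(eq_tgt(encode(s1), encode(s2))
               for s1 in terms for s2 in terms
               if eq_src(s1, s2))


def is_fully_abstract(terms, encode, eq_src, eq_tgt):
    return (is_sound(terms, encode, eq_src, eq_tgt)
            and is_complete(terms, encode, eq_src, eq_tgt))


# Toy source language: integers 0..7, equated when they have the same parity.
terms = list(range(8))

def same_parity(m, n):
    return m % 2 == n % 2

def equal(x, y):
    return x == y

def parity_encoding(n):      # identifies exactly the source-equivalent terms
    return n % 2

def mod4_encoding(n):        # distinguishes too much: sound, not complete
    return n % 4

def constant_encoding(n):    # identifies too much: complete, not sound
    return 0
```

The second sample encoding mirrors the situation discussed for Milner's encoding in the next section, where the translation separates terms that the source equivalence identifies.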

4 Encoding Polyadicity 

In this section we give an encoding of polyadic Lπ into monadic Lπ. We present
our encoding by comparison with Milner’s encoding of polyadic synchronous
π-calculus into monadic synchronous π-calculus [16]. Milner’s idea is quite
simple: one can emulate sending a tuple $\tilde{b}$ by sending a fresh channel
$w$ along which all names $b_i$ are transmitted sequentially. More precisely,
Milner gives an encoding $\{|\cdot|\}$ from polyadic processes to monadic ones
which is a homomorphism on all operators except input and output, for which we
have:

- $\{|a(x_1, \ldots, x_n).P|\} \stackrel{\mathrm{def}}{=} a(w).\,\mathrm{INP}(w, x_1, \ldots, x_n).\{|P|\}$

- $\{|\overline{a}\langle b_1, \ldots, b_n\rangle.Q|\} \stackrel{\mathrm{def}}{=} (\nu w)\,(\overline{a}w.\,\mathrm{OUT}(w, b_1, \ldots, b_n).\{|Q|\})$

where

- $\mathrm{INP}(w, x_1, \ldots, x_n) \stackrel{\mathrm{def}}{=} w(x_1).w(x_2) \cdots w(x_n)$

- $\mathrm{OUT}(w, b_1, \ldots, b_n) \stackrel{\mathrm{def}}{=} \overline{w}b_1.\overline{w}b_2 \cdots \overline{w}b_n$

and $w$ is a fresh name, i.e., it is not free in the translated processes.
Intuitively, $\mathrm{INP}(w, \tilde{x})$ and $\mathrm{OUT}(w, \tilde{b})$ model
a protocol which takes care of instantiating each variable $x_i$ with the
corresponding name $b_i$ by using a fresh channel $w$. Since $w$ is private to
$\mathrm{INP}(w, \tilde{x})$ and $\mathrm{OUT}(w, \tilde{b})$ no interferences
are possible. It is easy to show that there is an operational correspondence
between a polyadic process $P$ and its translation $\{|P|\}$: (i) if
$P \xrightarrow{\tau} P'$ then $\{|P|\} \xrightarrow{\tau}\,\gtrsim \{|P'|\}$ and
(ii) if $\{|P|\} \xrightarrow{\tau} P_1$ then there exists $P'$ such that
$P \xrightarrow{\tau} P'$ and $P_1 \gtrsim \{|P'|\}$, where $\gtrsim$ is the
inverse of the expansion relation $\lesssim$ (see Definition 3). From the
operational correspondence one can derive the soundness of the encoding when
considering barbed congruence as the behavioral equivalence in both source and
target languages. Unfortunately, as is well-known, Milner’s encoding is not
complete and therefore it is not fully-abstract. As a counterexample take
$R = a(\tilde{x}).a(\tilde{y}).0$ and
$S = a(\tilde{x}).0 \mid a(\tilde{y}).0$; then $R$ and $S$ are barbed congruent
but their encodings are not: $\{|S|\}$ may perform two consecutive inputs along
$a$ while in $\{|R|\}$ the input protocol $\mathrm{INP}(w, \tilde{x})$ blocks the
second input along $a$. In synchronous π-calculus, a similar counterexample can
be given by using outputs instead of inputs. These counterexamples essentially
say that Milner’s encoding is not fully-abstract because the protocols
$\mathrm{INP}(w, \tilde{x})$ and $\mathrm{OUT}(w, \tilde{b})$ prevent the
continuations $\{|P|\}$ and $\{|Q|\}$ from evolving. Thus, one might think of
adapting, somehow, Milner’s encoding so that the protocols
$\mathrm{INP}(w, \tilde{x})$ and $\mathrm{OUT}(w, \tilde{b})$ (or a variant of
them) are in parallel with the continuations and not in sequence. In (full)
π-calculus, such an adaptation is not possible because of the binding nature of
the input prefix. This problem can be avoided in Lπ by relying on Lemma 1 which
gives, under certain hypotheses, an interesting encoding for the substitution
operator. Recall the definition of $\tilde{a} \triangleright \tilde{b}$ from
Section 2.

Lemma 1 (Merro and Sangiorgi [15]). Let $\tilde{a}$ and $\tilde{b}$ be two
tuples with the same arity and such that
$\tilde{a} \cap \tilde{b} = \emptyset$, and let $P$ be an Lπ-process such that
all names $a_i \in \tilde{a}$ do not appear free in input position in $P$. It
holds that
$(\nu\tilde{a})\,(\tilde{a} \triangleright \tilde{b} \mid P) \cong_{L\pi} P\{\tilde{b}/\tilde{a}\}$.
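
To see the lemma at work on the smallest possible case, consider the following instance, which is our own illustration rather than an example taken from [15]: the tuples are the single names $a$ and $b$, and $P = \overline{a}\langle c\rangle$ with $a$, $b$, $c$ assumed pairwise distinct, so that $a$ does not occur in input position in $P$.

```latex
\begin{align*}
(\nu a)\,(a \triangleright b \mid \overline{a}\langle c\rangle)
  &= (\nu a)\,(!a(x).\overline{b}\langle x\rangle \mid \overline{a}\langle c\rangle) \\
  &\xrightarrow{\;\tau\;} (\nu a)\,(!a(x).\overline{b}\langle x\rangle \mid \overline{b}\langle c\rangle) \\
  &\cong_{L\pi} \overline{b}\langle c\rangle
   \;=\; \overline{a}\langle c\rangle\{b/a\}.
\end{align*}
```

The forwarder consumes the output on $a$ and re-emits it on $b$; once no output on the restricted name $a$ remains, the replicated input is inert and can be discarded. Since the silent step is the only move available to the initial process, this is consistent with the statement of the lemma.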

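In the following we show how Lemma 1 can be used to define an encoding
$(|\cdot|)$ of polyadic Lπ-processes into monadic ones. For simplicity, we
restrict ourselves to processes transmitting pairs of names. The general case,
when tuples of arbitrary size are transmitted, can be derived
straightforwardly. The encoding $(|\cdot|)$ is a homomorphism on all operators
except input and output, for which we have:

- $(|a(\tilde{x}).P|) \stackrel{\mathrm{def}}{=} a(w).\,(\nu\tilde{x})\,(\mathrm{INP}(w, \tilde{x}) \mid (|P|))$

- $(|\overline{a}\langle\tilde{b}\rangle|) \stackrel{\mathrm{def}}{=} (\nu w)\,(\overline{a}w \mid \mathrm{OUT}(w, \tilde{b}))$

where $w \notin \mathrm{fn}((|P|))$, and supposing $\tilde{x} = (x_1, x_2)$,
$\tilde{y} = (y_1, y_2)$, $\tilde{b} = (b_1, b_2)$, and
$\tilde{x} \triangleright \tilde{y} = \,!x_1(z).\overline{y_1}z \mid\, !x_2(z).\overline{y_2}z$
we define

- $\mathrm{INP}(w, \tilde{x}) \stackrel{\mathrm{def}}{=} (\nu c_1 c_3)\,(\overline{w}c_1 \mid c_1(c_2).(\overline{c_2}c_3 \mid c_3(y_1).c_1(y_2).\,\tilde{x} \triangleright \tilde{y}))$

- $\mathrm{OUT}(w, \tilde{b}) \stackrel{\mathrm{def}}{=} w(c_1).\,(\nu c_2)\,(\overline{c_1}c_2 \mid c_2(c_3).(\overline{c_3}b_1 \mid \overline{c_1}b_2))$.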

Like Milner’s encoding, $(|\cdot|)$ is based on the send of a private channel
$w$ used by $\mathrm{INP}(w, \tilde{x})$ and $\mathrm{OUT}(w, \tilde{b})$ for
transmitting names $b_i$. Unlike Milner’s encoding, in $(|\cdot|)$ the send of
names $b_i$ produces $n$ forwarders $x_i \triangleright b_i$ in parallel. More
precisely, by Lemma 1 it holds that:

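$$(|\overline{a}\langle\tilde{b}\rangle \mid a(\tilde{x}).P|) \;\Longrightarrow\,\gtrsim\; (\nu\tilde{x})\,(\tilde{x} \triangleright \tilde{b} \mid (|P|))$$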
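
The message order of the pair protocol can be traced with a small simulation. The sketch below is an assumption-laden illustration, not the encoding itself: Python FIFO queues stand in for channels, the two final outputs of OUT are issued one after the other rather than in parallel, the forwarders are omitted (the received names are simply returned), and `out_protocol`, `inp_protocol` and `transmit_pair` are hypothetical names.

```python
import queue
import threading


def out_protocol(w, b1, b2):
    """Sender side, following the shape of OUT(w, (b1, b2))."""
    c1 = w.get()          # w(c1): learn the receiver's private channel c1
    c2 = queue.Queue()    # (nu c2): fresh channel owned by the sender
    c1.put(c2)            # send c2 along c1
    c3 = c2.get()         # c2(c3): learn the receiver's second channel
    c3.put(b1)            # first component travels along c3
    c1.put(b2)            # second component travels along c1


def inp_protocol(w):
    """Receiver side, following the shape of INP(w, (x1, x2))."""
    c1, c3 = queue.Queue(), queue.Queue()   # (nu c1 c3)
    w.put(c1)             # send c1 along w
    c2 = c1.get()         # c1(c2)
    c2.put(c3)            # send c3 along c2
    y1 = c3.get()         # c3(y1)
    y2 = c1.get()         # c1(y2)
    return y1, y2         # the names the forwarders would link x1, x2 to


def transmit_pair(b1, b2):
    """Run both sides over a fresh private channel w and return what was received."""
    w = queue.Queue()
    sender = threading.Thread(target=out_protocol, args=(w, b1, b2))
    sender.start()
    received = inp_protocol(w)
    sender.join()
    return received
```

In this simulation each side only ever reads from queues it created itself (or, for the sender, from `w`, which the output side creates in the encoding), which reflects the locality discipline that only the output capability of a name is passed on.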

The encoding $(|\cdot|)$ is sound with respect to barbed congruence.
Unfortunately, in this form, the encoding is not yet fully-abstract because it
is not complete. As a counterexample take the processes
$R = \overline{a}\langle\tilde{b}\rangle$ and
$S = (\nu\tilde{d})\,(\overline{a}\langle\tilde{d}\rangle \mid \tilde{d} \triangleright \tilde{b})$,
with $\tilde{b} = (b_1, b_2)$ and $\tilde{d} = (d_1, d_2)$; then
$R \cong_{L\pi} S$ (see [15]) but $(|R|) \not\cong_{L\pi} (|S|)$, indeed let
$C[\cdot] = [\cdot] \mid T$ in which
T = a(w). (1ZC1C36.) (woi I 01(02). (0^03 I 03(2/1). oi(?/2)- (^6, I /i(a:).TO))), 

then $C[(|R|)] \Downarrow_m$ while $C[(|S|)] \not\Downarrow_m$. Notice that we
may not find a similar counterexample by using two ground bisimilar processes
$R$ and $S$. This
information allows us to give an amended variant of $(|\cdot|)$. By Theorem 2 we
know that the encoding $[\![\cdot]\!]$ of Section 2 maps barbed congruent
processes into ground bisimilar processes; that is, it holds that
$P \cong_{L\pi} Q$ iff $[\![P]\!] \approx [\![Q]\!]$ (on image-finite processes).
So, we can refine the encoding $(|\cdot|)$ by simply combining $(|\cdot|)$ with
$[\![\cdot]\!]$. More precisely, we define an encoding $(|[\cdot]|)$ of Lπ into
Lπ as the composition of $[\![\cdot]\!]$ and $(|\cdot|)$, thus if $P$ is an
Lπ-process

$$(|[P]|) \stackrel{\mathrm{def}}{=} (|\,[\![P]\!]\,|)$$

Notice that both encodings $[\![\cdot]\!]$ and $(|\cdot|)$ are divergence-free,
that is, they do not introduce infinite internal computations, so also the
encoding $(|[\cdot]|)$ does not introduce divergence.

5 Proving the Full Abstraction of $(|[\cdot]|)$

In this section we shall prove that, on image-finite and well-sorted processes,
$(|[\cdot]|)$ is fully abstract with respect to barbed congruence. To this end we
study the encoding $[\![(|\cdot|)]\!]$ obtained by inverting the order of the
application of the encodings $[\![\cdot]\!]$ and $(|\cdot|)$. We first prove an
operational correspondence, up to expansion, between processes $P$ and
$[\![(|P|)]\!]$. This will allow us to prove the soundness of $(|[\cdot]|)$. Then
we derive the completeness of $(|[\cdot]|)$ from a completeness result for
$[\![(|\cdot|)]\!]$.

Lemma 2 will allow us to prove the operational correspondence between
processes $P$ and $[\![(|P|)]\!]$.

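Lemma 2 (Boreale [4]).

1. Let $\tilde{a}$, $\tilde{b}$, $\tilde{c}$ be tuples of names of the same size
such that $(\tilde{a} \cup \tilde{c}) \cap \tilde{b} = \emptyset$. Then
$(\nu\tilde{b})\,(\tilde{a} \to \tilde{b} \mid \tilde{b} \to \tilde{c}) \gtrsim \tilde{a} \to \tilde{c}$.

2. Let $P$ be an Lπ process and $\tilde{a}$ and $\tilde{b}$ two tuples of names
such that the names in $\tilde{a}$ do not occur free in $P$ in input-subject
position and $\tilde{a} \cap \tilde{b} = \emptyset$. Then
$(\nu\tilde{a})\,(\tilde{a} \to \tilde{b} \mid [\![P]\!]) \gtrsim [\![P]\!]\{\tilde{b}/\tilde{a}\}$.

Remark 3. Lemma 2(2) can be seen as a variant of Lemma 1 up to $[\![\cdot]\!]$.
Actually, Lemma 1 follows directly from Lemma 2(2) and Theorem 2(2).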



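Lemma 3. Let $P$ be a well-sorted process in Lπ then:

1. Suppose that $P \xrightarrow{\alpha} P'$. Then we have:

(a) if $\alpha = a(\tilde{x})$ then
$[\![(|P|)]\!] \xrightarrow{a(w)}\,\gtrsim (\nu\tilde{x})\,([\![\mathrm{INP}(w, \tilde{x})]\!] \mid [\![(|P'|)]\!])$;

(b) if $\alpha = (\nu\tilde{c})\,\overline{a}\langle\tilde{b}\rangle$, with
$\tilde{c}$ possibly empty, then
$[\![(|P|)]\!] \xrightarrow{\overline{a}(p)}\,\gtrsim (\nu\tilde{c})\,((\nu w)\,(p \to w \mid [\![\mathrm{OUT}(w, \tilde{b})]\!]) \mid [\![(|P'|)]\!])$,
with $p \notin \mathrm{fn}(P')$;

(c) if $\alpha = \tau$ then
$[\![(|P|)]\!] \xrightarrow{\tau}\,\gtrsim [\![(|P'|)]\!]$.

2. Suppose that $[\![(|P|)]\!] \xrightarrow{\alpha} P_1$. Then there exists
$P' \in$ Lπ such that:

(a) if $\alpha = a(w)$ then $P \xrightarrow{a(\tilde{x})} P'$, for some
$\tilde{x}$, with
$P_1 \gtrsim (\nu\tilde{x})\,([\![\mathrm{INP}(w, \tilde{x})]\!] \mid [\![(|P'|)]\!])$;

(b) if $\alpha = \overline{a}(p)$ then
$P \xrightarrow{(\nu\tilde{c})\,\overline{a}\langle\tilde{b}\rangle} P'$, with
$\tilde{c}$ possibly empty, $p \notin \mathrm{fn}(P')$ and
$P_1 \gtrsim (\nu\tilde{c})\,((\nu w)\,(p \to w \mid [\![\mathrm{OUT}(w, \tilde{b})]\!]) \mid [\![(|P'|)]\!])$;

(c) if $\alpha = \tau$ then $P \xrightarrow{\tau} P'$ with
$P_1 \gtrsim [\![(|P'|)]\!]$.

Proof. By transition induction. The only subtle points arise in parts 1(c) and
2(c) where also Lemma 2 is used. Details can be found in [13].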



Remark 4. Note that the lemma above is not true when considering ill-sorted
processes. For instance, if $P = \overline{a}\langle b, c\rangle \mid a(x).Q$
then $[\![(|P|)]\!] \xrightarrow{\tau}$ while $P$ has no $\tau$-transition.

From Lemma 3 we can derive a weak operational correspondence.

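Lemma 4.

1. If $P \Longrightarrow P'$ then
$[\![(|P|)]\!] \Longrightarrow\,\gtrsim [\![(|P'|)]\!]$.

2. If $[\![(|P|)]\!] \Longrightarrow P_1$ then there is $P'$ s.t.
$P \Longrightarrow P'$ and $P_1 \gtrsim [\![(|P'|)]\!]$.

3. $P \Downarrow_a$ iff $[\![(|P|)]\!] \Downarrow_a$.

Proof. Parts 1 and 2 are proven by induction on the number of $\tau$-moves by
exploiting Lemma 3. Part 3 follows from parts 1 and 2, and Lemma 3.

Lemmas 3 and 4 allow us to prove the following two lemmas which will be useful
for proving the soundness of $(|[\cdot]|)$.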

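Lemma 5. Let $P$ and $Q$ be two Lπ-processes. Then

$[\![(|P|)]\!] \dot{\approx} [\![(|Q|)]\!]$ implies $P \dot{\approx} Q$.

Proof. We use Lemmas 3 and 4 and the fact that
$\lesssim\; \subseteq\; \approx\; \subseteq\; \dot{\approx}$ to prove that the
relation
$\mathcal{R} = \{(P, Q) : [\![(|P|)]\!] \dot{\approx} [\![(|Q|)]\!]\}$ is a
barbed bisimulation.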



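Lemma 6. Let $P$ and $Q$ be two Lπ-processes. Then

$(|[P]|) \dot{\approx} (|[Q]|)$ implies $P \dot{\approx} Q$.

Proof. Since $(|[\cdot]|) = (|\,[\![\cdot]\!]\,|)$, by Theorem 1 we have
$[\![(|\,[\![P]\!]\,|)]\!] \dot{\approx} [\![(|\,[\![Q]\!]\,|)]\!]$. By Lemma 5
we have $[\![P]\!] \dot{\approx} [\![Q]\!]$. By Theorem 1 we have
$P \dot{\approx} Q$.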

The following lemma will allow us to prove the completeness of $(|[\cdot]|)$.

Lemma 7. Let $P$ and $Q$ be two Lπ-processes. Then

$P \approx Q$ implies $[\![(|P|)]\!] \approx [\![(|Q|)]\!]$.

Proof. We prove that
$\mathcal{S} = \{([\![(|P|)]\!], [\![(|Q|)]\!]) : P \approx Q\}$ is a ground
bisimulation up to context and up to $\gtrsim$ [24]. Details can be found in
[13].

Finally we prove that, on image-finite and well-sorted processes, the encoding
$(|[\cdot]|)$ is fully-abstract with respect to barbed congruence.

Theorem 3 (full abstraction of $(|[\cdot]|)$). Let $P$ and $Q$ be image-finite
and well-sorted processes in Lπ, then

$P \cong_{L\pi} Q$ iff $(|[P]|) \cong_{L\pi} (|[Q]|)$.

Proof. The soundness follows from the compositionality of $[\![\cdot]\!]$ and
$(|\cdot|)$, and Lemma 6. As for completeness, by Theorem 2(1) we have
$[\![P]\!] \approx [\![Q]\!]$. By Lemma 7 we have
$[\![(|\,[\![P]\!]\,|)]\!] \approx [\![(|\,[\![Q]\!]\,|)]\!]$. By the monadic
variant of Theorem 2(2) we have
$(|\,[\![P]\!]\,|) \cong_{L\pi} (|\,[\![Q]\!]\,|)$, i.e.,
$(|[P]|) \cong_{L\pi} (|[Q]|)$.

6 What about $(|\cdot|)$ and $[\![(|\cdot|)]\!]$?

We have proved that the encoding $(|[\cdot]|)$ is fully-abstract w.r.t. barbed
congruence. Other possible candidates for a fully-abstract encoding of polyadic
Lπ into monadic Lπ are: $(|\cdot|)$ and $[\![(|\cdot|)]\!]$. Unfortunately, none
of them is fully-abstract w.r.t. barbed congruence or ground bisimilarity.

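In Section 4 we already showed that $(|\cdot|)$ is not fully-abstract w.r.t.
barbed congruence. The encoding $(|\cdot|)$ is not fully-abstract w.r.t. ground
bisimilarity either. As a counterexample take
$P = (\nu a)\,(\overline{a}\langle\tilde{b}\rangle \mid a(\tilde{x}).\overline{c}\langle\tilde{x}\rangle)$
and $Q = \overline{c}\langle\tilde{b}\rangle$; then $P \approx Q$, but
$(|P|) \approx (\nu\tilde{x})\,(\tilde{x} \triangleright \tilde{b} \mid (|\overline{c}\langle\tilde{x}\rangle|))$,
and $(|Q|) = (|\overline{c}\langle\tilde{b}\rangle|)$ and therefore
$(|P|) \not\approx (|Q|)$.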

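By Lemma 7 the encoding $[\![(|\cdot|)]\!]$ is complete w.r.t. ground
bisimilarity. Since the encoding $[\![(|\cdot|)]\!]$ enjoys an operational
correspondence up to expansion (Lemma 3), one may hope that
$[\![(|\cdot|)]\!]$ is sound w.r.t. ground bisimilarity and therefore
fully-abstract. Unfortunately, $[\![(|\cdot|)]\!]$ is not sound w.r.t. ground
bisimilarity. As a counterexample take
$P = (\nu\tilde{c})\,\overline{a}\langle\tilde{c}\rangle$ and
$Q = (\nu\tilde{c})\,(\overline{a}\langle\tilde{c}\rangle \mid \overline{c_1}\langle\tilde{b}\rangle)$,
with $\tilde{c} = (c_1, c_2)$; then $P \cong_{L\pi} Q$ (see [15]) and also
$(|P|) \cong_{L\pi} (|Q|)$; by Theorem 2,
$[\![(|P|)]\!] \approx [\![(|Q|)]\!]$; but $P \not\approx Q$. The encoding
$[\![(|\cdot|)]\!]$ is not fully-abstract w.r.t. barbed congruence either. As a
counterexample take the processes $P = \overline{a}\langle\tilde{b}\rangle$ and
$Q = (\nu\tilde{d})\,(\overline{a}\langle\tilde{d}\rangle \mid \tilde{d} \triangleright \tilde{b})$,
with $\tilde{b} = (b_1, b_2)$ and $\tilde{d} = (d_1, d_2)$; then, as already
shown in Section 4, $P \cong_{L\pi} Q$ and $(|P|) \not\cong_{L\pi} (|Q|)$; since
the encoding $[\![\cdot]\!]$ is sound w.r.t. barbed congruence (which follows by
Theorem 1 and the compositionality of $[\![\cdot]\!]$) we have that
$[\![(|P|)]\!] \not\cong [\![(|Q|)]\!]$.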

7 An Encoding of Polyadicity in Calculi with Non-binding
Input Prefix

In Section 4 we said that Milner’s encoding is not fully-abstract because the
protocols $\mathrm{INP}(w, \tilde{x})$ and $\mathrm{OUT}(w, \tilde{b})$ prevent
the continuations $\{|P|\}$ and $\{|Q|\}$ from evolving. Actually, the real
problem is the binding nature of the input prefix. Indeed, we can easily change
the encoding of $\overline{a}\langle\tilde{b}\rangle.P$ by putting the protocol
$\mathrm{OUT}(w, \tilde{b})$ in parallel with the continuation but we cannot do
the same with the encoding of input prefixes.

On the contrary, in calculi where the input prefix is non-binding, such as Chi
calculus [^, Fusion calculus [19] and πF-calculus [10], we can adapt Milner’s
encoding by simply putting the protocol $\mathrm{INP}(w, \tilde{x})$ in parallel
with the continuation. We conjecture that, in these calculi, such a variant of
Milner’s encoding is fully-abstract.

Let us consider, for instance, the Fusion calculus. (Actually, it might be
easier to work with the πF-calculus. We consider the Fusion calculus and not πF
only because the theory of Fusion is more stable.) For our purposes it suffices
to consider a finite fragment. The extension of our encoding when infinite
processes are allowed is straightforward. The grammar of finite Fusion calculus
has operators of inaction, non-binding input prefix, output prefixing, parallel
composition, and restriction:

$$P ::= 0 \;\mid\; a(\tilde{x}).P \;\mid\; \overline{a}\langle\tilde{b}\rangle.P \;\mid\; P \mid P \;\mid\; (a)P$$

Conventions about names are as in π-calculus, except for the non-binding input
prefix for which we have:

$$\mathrm{fn}(a(\tilde{x}).P) \stackrel{\mathrm{def}}{=} \{a\} \cup \{x_1, \ldots, x_n\} \cup \mathrm{fn}(P) \quad\text{and}\quad \mathrm{bn}(a(\tilde{x}).P) \stackrel{\mathrm{def}}{=} \mathrm{bn}(P).$$

We write $(\tilde{x})P$ for $(x_1) \ldots (x_n)P$. Conventions about processes
and substitutions are as in π-calculus. The reduction semantics is defined by
means of a notion of structural congruence (essentially the same as in
π-calculus) and a reduction relation. For simplicity, we give the basic
reduction rules for the monadic calculus; the generalization to the polyadic
case is slightly more complex:

(comm1) : $(x)(P \mid a(x).Q \mid \overline{a}\langle y\rangle.R) \to (x)((P \mid Q \mid R)\{y/x\})$

(comm2) : $(y)(P \mid a(x).Q \mid \overline{a}\langle y\rangle.R) \to (y)((P \mid Q \mid R)\{x/y\})$

Notice that the restrictions $(x)$ and $(y)$ in the derivatives make sense only
when $x = y$, otherwise, up to structural congruence, they disappear. The
definitions of observability, barbed bisimilarity, and barbed congruence are
essentially the same as in π-calculus. Finally, an important derived process,
called fusion, can be defined as follows:
$\{x = y\} \stackrel{\mathrm{def}}{=} (u)(u(x).0 \mid \overline{u}\langle y\rangle.0)$.

In Fusion calculus, communications can arise only in the presence of a scoping 
construct delimiting their effects (see reduction rules (comml) and (comm2)). 
So, to observe all potential communications (and their effects) it makes sense to 
consider a notion of barbed congruence obtained by closing barbed bisimulation 
under contexts that bind all free names of the tested processes. Similar closing 
contexts have been used in typed calculi m- As in the definition of testing 
equivalences [Zj, these contexts signal success by emitting along names that do 
not appear in the tested processes. 

Definition 6 (closed barbed congruence). Two processes $P$ and $Q$ are closed
barbed congruent, written $\cong^{c}$, if for each context $C[\cdot]$ such that
$\mathrm{fn}(P) \cap \mathrm{fn}(C[P]) = \mathrm{fn}(Q) \cap \mathrm{fn}(C[Q]) = \emptyset$,
it holds that $C[P] \dot{\approx} C[Q]$.

It is immediate to adapt this definition to other calculi, such as π-calculus.
Actually, in π-calculus and CCS, closed barbed congruence coincides with barbed
congruence: one can prove that closed barbed congruence coincides with the
closure under substitutions of early bisimulation, by adapting the proofs in
[12, 13]; since the closure under substitutions of early bisimulation is known
to coincide with barbed congruence, the two definitions of barbed congruence
coincide. Unfortunately, standard and closed barbed congruence do not coincide
in Fusion.

Milner’s encoding can be rewritten in Fusion calculus so that the instantiation
of names does not block the continuations:

$$(|a(x_1, \ldots, x_n).P|) \stackrel{\mathrm{def}}{=} (w)\,a(w).(w(x_1).w(x_2) \cdots w(x_n) \mid (|P|))$$

$$(|\overline{a}\langle b_1, \ldots, b_n\rangle.Q|) \stackrel{\mathrm{def}}{=} (w)\,\overline{a}\langle w\rangle.(\overline{w}\langle b_1\rangle.\overline{w}\langle b_2\rangle \cdots \overline{w}\langle b_n\rangle \mid (|Q|))$$

Note that the continuations $(|P|)$ and $(|Q|)$ may evolve without waiting for
the $n$ communications along the private channel $w$. This encoding is not sound
w.r.t. standard barbed congruence and not even w.r.t. hyperequivalence CSl,
because, in some sense, the encoding breaks the preemptive power of fusions.
More precisely, if we take $R = \{a = b\} \mid \{c = d\}$ and
$S = \{a = b\}.\{c = d\}$, then $R$ and $S$ are not equivalent while their
translations are. Nevertheless, we believe that the encoding $(|\cdot|)$ is
fully abstract with respect to closed barbed congruence. Our conjecture is due
to the fact that closed barbed congruence is insensitive to fusion prefixing,
that is, it handles fusion actions as silent moves. As a consequence, the
counterexample above is not valid anymore because processes
$\{a = b\} \mid \{c = d\}$ and $\{a = b\}.\{c = d\}$ are closed barbed
congruent. Unfortunately, we cannot prove the full abstraction of $(|\cdot|)$
using the same proof techniques of Section 5 because we do not know yet a
labeled characterization of closed barbed congruence in Fusion.



8 Conclusions and Related Works 

We have presented a divergence-free encoding $(|[\cdot]|)$ of polyadic Lπ into
monadic Lπ inspired by Milner’s encoding. The encoding exploits a property of Lπ
saying that, under certain hypotheses, substitution can be encoded in terms of
links, restriction, and parallel composition. This property allows us to define
an encoding of polyadicity where the machinery emulating the transmission of
tuples does not block continuations.

We have proved that, on image-finite and well-sorted processes, $(|[\cdot]|)$ is
fully-abstract with respect to barbed congruence. This shows that in Lπ (i)
polyadicity does not add extra expressive power, and (ii) when studying the
theory of polyadic Lπ we can focus on the simpler monadic variant. Finally, we
have proposed an encoding of polyadicity in name-passing calculi with
non-binding input prefix, such as Chi, Fusion and πF calculi Elfinilin], which
is based on the same idea of $(|[\cdot]|)$.

Of course, our encodings (like Milner’s) do not preserve well-sortedness.
This is a minor point because in monadic calculi there cannot be arity
mismatching.

Note that we have used synchronous barbed congruence where both input and
output actions are observed. Sometimes, in asynchronous calculi, only output
barbs are taken into account, validating the law
$a(x).\overline{a}\langle x\rangle = 0$. This law may be questioned because it
introduces divergences. For instance, from
$a(x).\overline{a}\langle x\rangle = 0$ we can derive the equality
$!a(x).\overline{a}\langle x\rangle \mid \overline{a}b = \overline{a}b$ between a
divergent and a non-divergent process. Our encoding is not complete w.r.t.
asynchronous barbed congruence precisely because
$(|[a(x).\overline{a}\langle x\rangle]|)$ and $(|[0]|)$ are not asynchronous
barbed congruent. However, we believe that $(|[\cdot]|)$ is fully-abstract
w.r.t. a variant of asynchronous barbed congruence which is sensitive to
divergence, along the lines of [28].

The works which are most closely related to ours are [29, 21] where type
systems for monadic π-processes are introduced in order to capture the
communication protocol underlying Milner’s encoding. More precisely, in [29] a
notion of graph type is introduced and studied. Nodes of a graph type represent
atomic actions, and edges an activation ordering between them. The approach in
[21] is similar but the type system is simpler. Both papers show a full
abstraction result with respect to typed contextual equivalences that reject
all contexts which do not respect the protocol imposed by the encoding. While
[29, 21] work on the full π-calculus our result only applies in Lπ. This is
because Lemmas 1 and 2 only work on Lπ-processes. On the other hand, we prove a
sharper result because we get the completeness of the encoding with respect to
all monadic contexts without rejecting “hostile” contexts.

In [8], Fournet and Gonthier provide, among other results, a fully-abstract
encoding of polyadic into monadic Join Calculus. Apart from the differences
between the two process calculi, the encoding in [8] is technically quite
different from ours; for instance, as the authors themselves say, their
translation encodes and then subsequently decodes tuples twice.

Abstract. Using rationality, like in language theory, we define a family
of infinite graphs. This family is a strict extension of the context-free 
graphs of Muller and Schupp, the equational graphs of Courcelle and the 
prefix recognizable graphs of Caucal. We give basic properties, as well 
as an internal and an external characterization of these graphs. We also 
show that their traces form an AFL of recursive languages, containing 
the context-free languages. 



1 Introduction 

When dealing with computers, infinite graphs are natural objects. They emerge
naturally in recursive program schemes or communicating automata, for example.
Studying them as families of objects is comparatively recent: Muller and
Schupp (in [MS 85]) first captured the structure of the graphs of pushdown
automata, then Courcelle (in [Co 90]) defined the set of regular (equational)
graphs. More recently Caucal introduced (in [Ca 96]) a characterization of
graphs in terms of inverse (rational) substitution from the complete binary
tree. Step by step, like Chomsky’s family of languages, a hierarchy of graph
families is built: the graphs of pushdown automata, regular graphs and
prefix-recognizable graphs.

To define infinite objects conveniently, we have to use finite systems. For 
infinite graphs, two kinds of finite systems are employed: internal systems or ex- 
ternal systems. Roughly speaking an internal characterization is a machine pro- 
ducing the arcs of the graph. An external characterization yields the structure 
of the graph (usually “up to isomorphism”). There is, of course a relationship 
between internal and external characterization: for example the pushdown au- 
tomata are an internal characterization of the connected regular graphs of finite 
degree whereas the deterministic graph grammars are an external system for 
the family of regular graphs. 

The purpose of this article is to give both internal and external characteri- 
zation of a wider family of graphs. Using words for vertices, rationality (like in 
language theory) will provide an internal characterization; it will also give basic 
results for this family: for example rational graphs will be recognized by trans- 
ducers; a rational graph is a recursive set; determinism for rational graphs will 
be decidable. Then inverse substitution from the complete binary tree (like in
[Ca 96]) will be an external characterization of this family. Strangely, this extension
will prove to be a slight extension of the prefix-recognizable graphs: instead

J. Tiuryn (Ed.): FOSSACS 2000, LNCS 1784, pp. 252- f2EUl 2000. 
of taking the inverse image of the complete binary tree by a rational substitution
we will consider the inverse image of the complete binary tree by a linear substitution
(i.e., a substitution where the image of each letter is a linear language).
Finally properties of the traces of these graphs will be investigated: we will show
that the traces of these graphs form an abstract family of (recursive) languages
containing the context-free languages. 

2 Rational Graphs 

In this section we will define a new family of infinite graphs, namely the set of 
rational graphs. We will state some results for this family and give examples of 
rational graphs. 

2.1 Partial Semigroups 

This paragraph introduces rationality for partial semigroups and uses this notion 
to give a natural introduction for rational graphs. 

We start by recalling some standard notations: for any set $E$, its cardinal is denoted
by $|E|$; its powerset is denoted by $2^E$. Let the set of nonnegative integers be
denoted by $\mathbb{N}$. A semigroup $S$ is a set equipped with an operation $\cdot : S \times S \to S$
such that for all $u, v$ in $S$ there exists $w$ in $S$ such that $\cdot(u, v) = w$, denoted by
$u \cdot v = w$, and this operation is associative (i.e., $\forall u, v, w \in S$, $(u \cdot v) \cdot w = u \cdot (v \cdot w)$).
Finally, a monoid $M$ is a semigroup with a (unique) neutral element (denoted
$\varepsilon$ along these lines), i.e., an element $\varepsilon \in M$ such that for every element $u$ in $M$,
$u \cdot \varepsilon = \varepsilon \cdot u = u$.

Now, a partial semigroup is a set $S$ equipped with a partial operation $\cdot : S \times S \to S$,
with $\mathcal{D} \subseteq S \times S$ the domain of $\cdot$; the set $\mathcal{D}$ need not be $S \times S$. Moreover
we require this operation to be associative in the following sense:
$[(u, v) \in \mathcal{D} \wedge (u \cdot v, w) \in \mathcal{D}] \iff [(v, w) \in \mathcal{D} \wedge (u, v \cdot w) \in \mathcal{D}]$, and in that case $u \cdot (v \cdot w) = (u \cdot v) \cdot w$.
This means that if multiplication is defined on one side, then it is defined on
the other side and both agree.

Notice that a partial semigroup $S$ such that $\mathcal{D}$ is $S \times S$ is a semigroup.

Example 2.1. Consider two semigroups $(S_1, \cdot_1)$ and $(S_2, \cdot_2)$ such that $S_1 \cap S_2$ is
empty. The union $S = S_1 \cup S_2$, with the partial operation $\cdot$ defined as $\cdot_1$ over
the elements of $S_1$ and $\cdot_2$ over the elements of $S_2$, is a partial semigroup.

Taking a new element $\top$ we complete any partial semigroup $S$ into a semigroup
$S \cup \{\top\}$ by extending its operation $\cdot$ as follows:

$a \cdot b = \top$ for all $a, b \in S \cup \{\top\}$ such that $(a, b) \notin \mathcal{D}$.

Also the product $S \times S'$ of two partial semigroups $S$ and $S'$ is a partial semigroup
for the operation $\cdot$ defined componentwise:

$(a, a') \cdot (b, b') = (a \cdot b, a' \cdot b')$ for all $(a, b) \in \mathcal{D}$ and $(a', b') \in \mathcal{D}'$.

In order to define the rational subsets of a partial semigroup, we have to extend 
its operation to its subsets: 

$A \cdot B := \{ a \cdot b \mid a \in A \wedge b \in B \}$ for every $A, B \subseteq S$.

The powerset $2^S$ of $S$ is a semigroup for $\cdot$ so defined.

Now, a subset $P$ of a partial semigroup $S$ is a partial subsemigroup of $S$ if $P$ is
a partial semigroup for $\cdot$ of $S$, i.e., $P \cdot P$ is a subset of $P$.

For any subset $P$ of a partial semigroup $S$, the subset $P^+ = \bigcup_{n \ge 1} P^n$
(with $P^1 = P$ and $P^{n+1} = P^n \cdot P$ for every $n \ge 1$) is the smallest (for inclusion)
partial subsemigroup of $S$ containing $P$. The set $P^+$ is called the partial semigroup
generated by $P$. In particular $(P^+)^+ = P^+$. Also, $S$ is finitely generated if
$S = P^+$ for some finite $P$.

A set $P \subseteq S$ is a code if no element of $P^+$ has two distinct factorizations over $P$:
$u_1 \cdots u_m = v_1 \cdots v_n \wedge u_1, \ldots, u_m, v_1, \ldots, v_n \in P \implies m = n \wedge \forall i \in [1 \ldots n],\ u_i = v_i$

A partial semigroup $S$ is free if there is a code $P$ such that $P^+ = S$.

For every $W \subseteq 2^S$, we denote $\bigcup W = \{ a \mid \exists P \in W,\ a \in P \}$. Operator $+$
commutes with operator $\bigcup$, i.e., $\bigcup (W^+) = (\bigcup W)^+$ for every $W \subseteq 2^S$.

The (left) residual $u^{-1}P$ of $P \subseteq S$ by $u \in S$ is the following subset:
$u^{-1}P := \{ v \in S \mid u \cdot v \in P \}$
and it satisfies the following basic equality:

$(u \cdot v)^{-1}P = v^{-1}(u^{-1}P)$ for all $u, v \in S$ and $P \subseteq S$.
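The residual equality can be checked on a small instance. The sketch below is only an illustration under stated assumptions: it takes $S$ to be a free monoid of strings and $P$ a finite set of words, and the names `residual`, `P`, `u`, `v` are hypothetical, chosen for this example.

```python
def residual(u, P):
    """Left residual u^{-1}P = {v | u.v in P}, for a finite set P of strings."""
    return {w[len(u):] for w in P if w.startswith(u)}


# A small finite set of words standing in for P (illustrative choice).
P = {"ab", "abc", "abab", "ba"}
u, v = "a", "b"

# (u.v)^{-1}P computed directly ...
lhs = residual(u + v, P)
# ... and as v^{-1}(u^{-1}P), residual by u first, then by v.
rhs = residual(v, residual(u, P))
```

Both sides are computed independently, so comparing `lhs` and `rhs` exercises the equality rather than assuming it.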

Definition 2.2. Let $(S, \cdot)$ be a partial semigroup. The family $Rat(S)$ of rational
subsets of $S$ is the least family $\mathcal{R}$ of subsets of $S$ satisfying the following
conditions:

(i) $\emptyset \in \mathcal{R}$; $\{m\} \in \mathcal{R}$ for all $m$ in $S$;

(ii) if $A, B \in \mathcal{R}$ then $A \cup B$, $A \cdot B$ and $A^+ \in \mathcal{R}$.

In order to generalize well known results for monoids in the case of partial 
semigroups, and as our purpose is to deal with graphs, we will set some notations 
and definitions for graphs and automata. 

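Let $P$ be a subset of $S$. A (simple oriented labelled) $P$-graph $G$ over $V$ with arcs
labelled in $P$ is a subset of $V \times P \times V$. An element $(s, a, t)$ in $G$ is an arc of
source $s$, goal $t$ and label $a$ ($s$ and $t$ are vertices of $G$). We denote by $Dom(G)$,
$Im(G)$ and $V_G$ the sets respectively of sources, goals and vertices of $G$. Each
$(s, a, t)$ of $G$ is identified with the labelled transition $s \xrightarrow{a}_G t$, or simply $s \xrightarrow{a} t$ if $G$ is
understood.

A graph $G$ is deterministic if distinct arcs with the same source have distinct labels:
$r \xrightarrow{a} s \wedge r \xrightarrow{a} t \implies s = t$. A graph is (source) complete if, for every label
$a$, every vertex is the source of an arc labelled $a$: $\forall a \in P,\ \forall s \in V_G,\ \exists t,\ s \xrightarrow{a} t$.

The set $2^{V \times P^+ \times V}$ of $P^+$-graphs with vertices in $V$ is a semigroup for the composition
relation: $G \cdot H := \{ r \xrightarrow{a \cdot b} t \mid \exists s,\ r \xrightarrow{a}_G s \wedge s \xrightarrow{b}_H t \}$ for any $G, H \subseteq V \times P^+ \times V$.

The relation $\xrightarrow{u}_{G^+}$, denoted by $\overset{u}{\Longrightarrow}_G$ or simply $\overset{u}{\Longrightarrow}$ if $G$ is understood, is the existence
of a path in $G$ labelled $u$ in $P^+$. For any $L \subseteq S$, we denote by $s \overset{L}{\Longrightarrow} t$ that there
exists $u$ in $L$ such that $s \overset{u}{\Longrightarrow} t$.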
The trace (or set of path labels) $L(G, E, F)$ of $G$ from a set $E$ to a set $F$ is the
following subset of $P^+$:

$L(G, E, F) := \{ u \in S \mid \exists s \in E,\ \exists t \in F,\ s \overset{u}{\Longrightarrow}_G t \}$

Given $P \subseteq S$, a $P$-automaton $A$ is a $P$-graph $G$ whose vertices are called states,
with an initial state $i$ and a subset $F$ of final states; the automaton recognizes the
subset $L(A)$ of $P^+$: $L(A) := L(G, \{i\}, F)$. An automaton is finite (resp. deterministic,
complete) if its graph is finite (resp. deterministic, complete). This
allows us to state a standard result for rational subsets.

Proposition 2.3. Given a subset $P$ of a partial semigroup $S$, $Rat(P^+)$ is

(i) the smallest subset of $2^S$ containing $\emptyset$ and $\{a\}$ for each $a \in P$, and closed
for $\cup$, $\cdot$, $+$;

(ii) the set of subsets recognized by finite $P$-automata,

(iii) the set of subsets recognized by finite and deterministic $P$-automata.

We simply translated the standard definitions of rational subsets of monoids
given for example in [Be 79]. An interesting example of a partial semigroup is
the subject of these lines: the set of arcs (labelled with an element of a finite set) 
between elements of a free monoid is a partial semigroup; its rational subsets 
are the rational graphs. 

2.2 Partial Semigroups and Graphs 

In this section, we will consider an important example of partial semigroup: the 
set of rational graphs. So consider an arbitrary finite set $X$ and denote by $X^*$ its
associated free monoid. We will consider graphs as subsets of $X^* \times A \times X^*$ (the
set of graphs over $X^*$ with arcs labelled in $A$). For convenience, the set $2^{X^* \times A \times X^*}$ is
denoted $G_A(X^*)$.

Now, with $(u, a_i, v) \cdot_i (u', a_i, v') = (u \cdot u', a_i, v \cdot v')$, the set $X^* \times \{a_i\} \times X^*$ ($a_i$ in
$A$) is a monoid. As stated in Example 2.1, the union of these monoids (namely
$X^* \times A \times X^*$) is a partial semigroup. We denote by $\cdot$ the operation in $X^* \times A \times X^*$
(which is $\cdot_i$ for each $X^* \times \{a_i\} \times X^*$).
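As a rough illustration of this partial operation, the following sketch represents an arc as a triple of strings `(u, d, v)` and lifts the product to finite sets of arcs. The function names `arc_product`, `graph_product` and `bounded_plus` are hypothetical, and the bounded iteration only approximates the operator $+$ on finite data.

```python
def arc_product(e, f):
    """Product of two arcs; defined only when both carry the same label."""
    (u, d, v), (u2, d2, v2) = e, f
    if d != d2:
        return None  # outside the domain of the partial operation
    return (u + u2, d, v + v2)


def graph_product(G, H):
    """Extension of the arc product to finite sets of arcs."""
    products = (arc_product(e, f) for e in G for f in H)
    return {p for p in products if p is not None}


def bounded_plus(G, n):
    """Union of G^1, ..., G^n: a finite approximation of G^+."""
    result, power = set(G), set(G)
    for _ in range(n - 1):
        power = graph_product(power, G)
        result |= power
    return result


# Arcs A^k --a--> A^(k+1) for k = 1, 2, built from two generators.
some_a_arcs = graph_product(bounded_plus({("A", "a", "A")}, 2), {("", "a", "A")})
```

Arcs with different labels simply have no product, which is what makes the structure a partial semigroup rather than a semigroup.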

Remark: this $\cdot$ operation for graphs is indeed similar to the synchronization
product for transition systems defined by Nivat and Arnold in [AN 88].

We are now able to define the set of rational graphs. 

Definition 2.4. The set of rational graphs, denoted $Rat(X^* \times A \times X^*)$, is the
family of rational subsets of $X^* \times A \times X^*$.

Let us now recall that a transducer is a finite automaton over pairs (see for
example [Au 88], [Be 79]). A rational relation (i.e., a rational subset of $X^* \times X^*$)
is recognized by a rational transducer.

There is a strong relationship between rational graphs and rational relations and 
to characterize the family of rational graphs in a more practical way we will use 
labelled transducers. 
Definition 2.5. A labelled transducer $T = (Q, I, F, E, L)$ over $X$ is composed
of a finite set of states $Q$, a set of initial states $I \subseteq Q$, a set of final states $F \subseteq Q$,
a finite set of transitions (or edges) $E \subseteq Q \times X^* \times X^* \times Q$ and an application $L$
from $F$ into $2^A$.

Like for $P$-graphs, a transition $(p, u, v, q)$ of transducer $T$ will be denoted by $p \xrightarrow{u/v}_T q$,
or simply $p \xrightarrow{u/v} q$ if $T$ is understood. Now similarly an element $(u, d, v) \in X^* \times A \times X^*$
is recognized by transducer $T$ if there is a path $p_0 \xrightarrow{u_1/v_1}_T p_1 \cdots p_{n-1} \xrightarrow{u_n/v_n}_T p_n$
with $p_0 \in I$, $p_n \in F$, $u = u_1 \cdots u_n$, $v = v_1 \cdots v_n$ and $d \in L(p_n)$.

Remark: an illustration of transducer execution will be given in Example 2.7.
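The recognition condition above can be turned into a membership test by exploring configurations (state, position in $u$, position in $v$); there are finitely many of them, so the search terminates. The sketch below is a hypothetical illustration, not the paper's construction: the function `recognizes` and the data layout are assumptions, and the example transducer `GRID` is a guess at a grid transducer that is merely consistent with the path $p \to p \to q_2 \to q_2$ discussed in Example 2.7 (state `q1` and its transitions are invented for the illustration).

```python
from collections import deque


def recognizes(transitions, initial, labels, u, d, v):
    """Decide whether the arc (u, d, v) is recognized.

    transitions: list of (p, x, y, q) meaning p --x/y--> q (x, y may be empty)
    initial:     set of initial states
    labels:      dict mapping each final state to its set of arc labels
    """
    start = [(p, 0, 0) for p in initial]
    seen = set(start)
    queue = deque(start)
    while queue:
        p, i, j = queue.popleft()
        # Accept when both words are fully read in a final state labelled d.
        if i == len(u) and j == len(v) and d in labels.get(p, set()):
            return True
        for (p1, x, y, q) in transitions:
            if p1 == p and u.startswith(x, i) and v.startswith(y, j):
                nxt = (q, i + len(x), j + len(y))
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return False


# Assumed transducer for the grid on vertices A^n B^m (illustrative only).
GRID = [
    ("p", "A", "A", "p"),
    ("p", "", "A", "q1"), ("q1", "B", "B", "q1"),   # a: A^n B^m -> A^(n+1) B^m
    ("p", "", "B", "q2"), ("q2", "B", "B", "q2"),   # b: A^n B^m -> A^n B^(m+1)
]
GRID_INITIAL = {"p"}
GRID_LABELS = {"q1": {"a"}, "q2": {"b"}}
```

Because each configuration is visited at most once, cycles labelled with empty words on both sides do not cause non-termination.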



Proposition 2.6. A graph $G$ in $G_A(X^*)$ is rational if and only if it satisfies
one of the following equivalent properties:

(i) $G$ belongs to the smallest subset of $G_A(X^*)$ containing
$\emptyset$, $\{\varepsilon \xrightarrow{d} \varepsilon\}$, $\{x \xrightarrow{d} \varepsilon\}$ and $\{\varepsilon \xrightarrow{d} x\}$, for all $x \in X$ and all $d \in A$, and closed
under $\cup$, $\cdot$ and $+$;

(ii) $G$ is a finite union of rational relations over each letter:
$G = \bigcup_{d \in A} R_d$, for $R_d \in Rat(X^* \times \{d\} \times X^*)$;

(iii) $G$ is recognized by a labelled rational transducer.



This Proposition states that for any graph $G$ in $Rat(X^* \times A \times X^*)$, the relation
$\xrightarrow{d}_G := \{ (u, v) \mid u \xrightarrow{d}_G v \}$ is rational for each $d$ in $A$. Therefore we also introduce
$\to_G := \bigcup_{d \in A} \xrightarrow{d}_G$, which is also a rational relation. Naturally we denote by
$\xrightarrow{d}_G(u)$ (resp. $\to_G(u)$) the image of word $u$ by relation $\xrightarrow{d}_G$ (resp. $\to_G$) (and
similarly for subsets of $X^*$). Also, for a rational graph $G$ there are possibly many
transducers generating it, thus we will denote by $\Theta(G)$ the set of transducers
generating $G$.

We will now give some examples of rational graphs. 



Example 2.7. This graph:

[Figure: the infinite grid, with arcs labelled $a$ and $b$]

is a rational graph generated by this transducer:

[Figure: labelled transducer generating the grid]
Notice that its second order monadic theory is undecidable, and therefore rational
graphs have, in general, an undecidable second order monadic theory.

Why does the arc $(AB, b, AB^2)$ belong to the graph? Simply because the following
path is in the transducer:

$p \xrightarrow{A/A} p \xrightarrow{\varepsilon/B} q_2 \xrightarrow{B/B} q_2$

and $b$ is associated to the final state $q_2$.



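Example 2.8. This graph:

[Figure: graph whose vertices are words over $\{0, 1\}$, such as 000 and 111]

is rational, generated by this transducer:

[Figure: labelled transducer generating the graph]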

We finish with a last example showing that the transition graphs of Petri 
nets are rational graphs. 

Example 2.9. For more detail on Petri nets the reader may refer to [Re 85]. A
Petri net can be seen as a finite set of transitions of this form:

$A_1^{n_1} A_2^{n_2} \cdots \xrightarrow{d} A_1^{m_1} A_2^{m_2} \cdots$, with $A_i^x$ representing that there are $x$ coins in
place $A_i$ ($d$ represents the label (if any) of the transition). The following transducer
generates the transition graph associated to the above transition:

[Figure: transducer for the above transition; one of its transition labels is $A_2/A_2$]

Each vertex of the generated graph corresponds to a marking of the Petri
net. Each arc of the graph represents the firing of a transition.
2.3 Some Results for Rational Graphs

This section will introduce results for this family of graphs. Some of these results 
are just a reformulation of known results over rational relations. Others are 
simple facts on these graphs and their boundary. 

The first fact is that this family is an extension of previous families. Simply
recall that every prefix-recognizable graph (defined in [Ca 96]) is a finite union of
graphs of the following form:

$(U \xrightarrow{a} V) \cdot W := \{ uw \xrightarrow{a} vw \mid u \in U \wedge v \in V \wedge w \in W \}$
with $U$, $V$, $W$ rational sets.

This characterization ensures that prefix-recognizable graphs are rational graphs.
As the regular graphs (defined in [Co 90]) are prefix-recognizable graphs, they are
rational too. Furthermore, the graphs in Examples 2.7 and 2.8 are not prefix-recognizable
graphs, thus the inclusion is strict. Let us now translate some well-known
results for rational relations to rational graphs (the proofs will be omitted;
they are mostly direct consequences of results found in [Au 88] and [Be 79]).

Proposition 2.10. A rational graph $G$ is of finite out-degree if and only if there
exists a transducer $T \in \Theta(G)$ such that there exists no cycle in $T$ labelled on the
left with the empty word which is not labelled on the right with the empty word.
In other words, the only cycles labelled $\varepsilon$ on the left are labelled $\varepsilon$ on the right.

Remark: naturally this proposition can be translated to characterize the graphs 
of finite in-degree, by simply replacing right by left and vice-versa. 

Proposition 2.11. Every rational graph is recursive: it is decidable whether an
arc $(u, d, v)$ belongs to a rational graph.

Theorem 2.12. It is decidable whether a rational graph is deterministic (from 
its transducer) . 

Proposition 2.13. The inclusion and equality of deterministic rational graphs 
is decidable. 

Remark: unfortunately this result ceases to be true for general rational graphs
([Be 79], Theorem 8.4, page 90).

We have already seen that the second order monadic theory of these graphs is 
undecidable in general. We will now see that it is also the case for the first order 
theory. 

Proposition 2.14. The first order theory of rational graphs is undecidable. 

Proof. We will prove this proposition by reducing Post's correspondence problem
(P.C.P.) to this problem. Let us recall the P.C.P.: given an alphabet $X$ and
$(u_0, v_0), (u_1, v_1), \ldots, (u_n, v_n)$ elements of $X^* \times X^*$, does there exist a sequence
$0 \le i_1, i_2, \ldots, i_m \le n$ such that $u_0 u_{i_1} \cdots u_{i_m} = v_0 v_{i_1} \cdots v_{i_m}$? To an instance of
P.C.P. (i.e., a family $(u_i, v_i)$) we associate the following transducer:

[Figure: transducer associated with the P.C.P. instance]

The resolution of P.C.P. becomes finding a vertex $s$ such that $s \to s$ is an arc
of the graph generated by the transducer. This is a first order property; therefore, as
P.C.P. is undecidable, the first order theory of rational graphs is not decidable
in general. □

Before giving another negative decision result, let us denote by $\tilde{u}$ the mirror
of word $u$, defined by induction on the length of $u$: $\tilde{\varepsilon} = \varepsilon$ and $\widetilde{au} = \tilde{u}a$ (for any letter $a$ and any
$u$ with $|u| \ge 0$).

Proposition 2.15. Accessibility is not decidable for rational graphs in general. 



Proof. Once again, we use P.C.P. Using the same notations as earlier, define a
(word) rewriting system $G$, using two new symbols # and \$, in the following way:

$\$ \to \tilde{u}_i \, \$ \, v_i \quad \forall i \in \{0, \ldots, n\}$
$\$ \to \#$
$A \# A \to \# \quad \forall A \in X$



Now “P.C.P. has a solution” is equivalent to the existence of a derivation from
$\tilde{u}_0 \$ v_0$ to #. But, considering the following transducer:

[Figure: transducer with transitions labelled $A/A$ (for $A \in X$), $\$/\#$, $\$/\tilde{u}_i \$ v_i$ (for $i \in \{0, \ldots, n\}$) and $A \# A/\#$ (for $A \in X$)]

the question becomes: is there a path leading from $\tilde{u}_0 \$ v_0$ to the vertex #?
Answering this question would allow P.C.P. to be solved in the general case,
which is a contradiction. Therefore accessibility is undecidable for rational
graphs in general. □
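To make the role of the rewriting rules concrete, the sketch below checks one candidate index sequence: it applies the rule that replaces \$ by a mirrored $u_i$, \$ and $v_i$ for each index, then replaces \$ by #, and finally collapses with the rule $A\#A \to \#$. This is only an illustration under assumptions: the helper names `build_word` and `collapses` are hypothetical, strings stand for words, and the letters of the instance are assumed to differ from `#`. It verifies a given sequence; it does not (and cannot) decide P.C.P.

```python
def mirror(w):
    """Mirror image of a word."""
    return w[::-1]


def build_word(pairs, indices):
    """Word obtained from mirror(u_0) $ v_0 by rewriting $ once per index,
    then rewriting $ into #."""
    left, right = mirror(pairs[0][0]), pairs[0][1]
    for i in indices:
        left = left + mirror(pairs[i][0])   # grows to the left of $
        right = pairs[i][1] + right         # grows to the right of $
    return left + "#" + right


def collapses(word):
    """True when repeated use of A#A -> # rewrites the word into '#'."""
    while len(word) >= 3:
        k = word.index("#")
        if k == 0 or k == len(word) - 1 or word[k - 1] != word[k + 1]:
            return False
        word = word[:k - 1] + "#" + word[k + 2:]
    return word == "#"


# Hypothetical instance: pair 0 is ("ba", "a"), pair 1 is ("a", "ab").
pairs = [("ba", "a"), ("a", "ab")]
```

With this instance, the sequence `[1]` yields the word `aba#aba`, which collapses, while the empty sequence yields `ab#a`, which does not.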

Remark: the transitive closure of a rational graph is, at least, not effectively computable. If this
construction were effective and rational, then accessibility for rational graphs
would be decidable.

Now we will see a case where accessibility is decidable for rational graphs. A
transducer $T$ is increasing if every pair $(u, v)$ recognized by $T$ is such that the
length of $v$ (denoted by $|v|$) is greater than or equal to the length of $u$: $|v| \ge |u|$.

Proposition 2.16. The accessibility is decidable for any rational graph with an 
increasing transducer. 

Proof. Let us denote by $T^{\le n}(u)$ the following set: $T^{\le n}(u) := \bigcup_{i=0}^{n} T^i(u)$. For all
$n \in \mathbb{N}$ this set is rational.

Now, let $G$ be a rational graph generated by an increasing transducer $T$ and let
$u$ and $v$ be two vertices of $G$. Let us put $n_0 = |\{ w \in X^* \mid |u| \le |w| \le |v| \}| = |X|^{|u|} + \cdots + |X|^{|v|}$.
Vertex $v$ is accessible from $u$ if and only if $v$ belongs to
$T^{\le n_0}(u)$. Thus accessibility is decidable for rational graphs with an increasing
transducer. □
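The counting argument of this proof amounts to a search confined to words whose length lies between $|u|$ and $|v|$. A minimal sketch follows, assuming the graph is given by a hypothetical function `successors(w)` that returns the finitely many targets of arcs leaving `w` within the length bound, none shorter than `w`; the function `accessible` and the example `grid_successors` are illustrations, not the paper's construction.

```python
from collections import deque


def accessible(u, v, successors):
    """Breadth-first search from u to v, visiting only words no longer than v.

    Assumes successors(w) never returns a word shorter than w, so every
    vertex on a path from u to v has a length between len(u) and len(v).
    """
    if len(v) < len(u):
        return False
    seen = {u}
    queue = deque([u])
    while queue:
        w = queue.popleft()
        if w == v:
            return True
        for t in successors(w):
            if len(t) <= len(v) and t not in seen:
                seen.add(t)
                queue.append(t)
    return False


def grid_successors(w):
    """Assumed grid on words A^n B^m: one more A in front, or one more B."""
    return ["A" + w, w + "B"]
```

The set `seen` can never exceed the number $n_0$ of words in the length window, which is the bound used in the proof.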
We now give a technical lemma that allows the construction of a graph that
is not structurally rational.

Lemma 2.17. Let $G$ be a rational graph of finite out-degree. There exist two
integers $p$ and $q$ such that for every $(s, a, t) \in G$ we have $|t| \le p \cdot |s| + q$.



Example 2.18. Consider an infinite tree in X* xAxX* such that every vertex 
of depth n has 2 ^ sons. This tree is not structurally rational; in other words,
whatever names are given to its vertices, this graph is never a rational graph. This
is a direct consequence of previous lemma: say n is the length of the root, there 
are at most vertices of depth 1 . 



Despite these results the transducers are not able to capture the structure of
rational graphs. For example, this transducer:

[Figure: transducer with transitions labelled $A/A$, $\varepsilon/AB$ and $B/B$]

generates this graph:

[Figure: the generated graph, with arcs labelled $a$ and $b$]



The connected component of the empty word, e, is a straight-line. It is “up 
to isomorphism” obviously rational, but as a sub-graph of this graph, it is not 
rational (its vertices form a context-free language) . Therefore we need an external 
(“up to isomorphism”) characterization of these graphs. This is the subject of 
the next section. 

3 An External Characterization 

In this section, we will characterize rational graphs using inverse linear substitutions.
Labelled transducers are an internal representation of rational graphs: this
representation clearly depends on the names of the vertices. But often in graph theory, the names
of the vertices are not relevant; they carry no information. An external characterization,
like the graph grammars for equational graphs, produces graphs without
giving names for vertices. It only gives the structure of the graph. Inverse linear 
substitution is an external characterization of rational graphs. 

3.1 Graph Isomorphism 

An external characterization of rational graphs is given “up to isomorphism”.
Two graphs $G_1$ and $G_2$ in $G_A(X^*)$ are isomorphic if there is a bijection
$\psi : V(G_1) \to V(G_2)$ such that: $s_1 \xrightarrow{d}_{G_1} s_2$ (i.e., $(s_1, d, s_2) \in G_1$) if and only if
$\psi(s_1) \xrightarrow{d}_{G_2} \psi(s_2)$.

Two isomorphic graphs have the same structure: they are the same up to a
renaming of the vertices.

Now let us consider the equivalence ($\equiv$) generated by graph isomorphism: we say
that $G_1$ is equivalent to $G_2$ (denoted $G_1 \equiv G_2$) if $G_1$ and $G_2$ are isomorphic. This
equivalence relation provides us with a partition of $G_A(X^*)$ denoted $Graph_A := G_A(X^*)/\equiv$.
This allows the introduction of the set of structural rational graphs:

$GRat_A := \{ [G]_\equiv \in Graph_A \mid G \in Rat(X^* \times A \times X^*) \}$

This set is the set of graphs that are isomorphic to some rational graph.

The set $Graph_A$ (and $GRat_A$) does not depend on the choice of the set $X$; therefore we
can choose $X$ to be any two-letter alphabet with no loss of generality.

Lemma 3.1. For every subset $X'$ (with at least two elements) of $X$ and every class
$[G]_\equiv$ of $Graph_A$ ($= G_A(X^*)/\equiv$) there exists $G_0$ in $G_A(X'^*)$ such that $G_0 \in [G]_\equiv$.

We now have to characterize the structure of $GRat_A$. This is the goal of the
next section.

3.2 Substitution 

Recall the definition of the prefix-recognizable graphs (family $REC_{Rat}$). This
family has been defined as the set of graphs obtained from the complete binary
tree by inverse rational substitution, followed by rational restriction. We will
use the same process (actually a linear context-free substitution) to obtain the
family of rational graphs.

A substitution over a free monoid $X^*$ is a morphism $\varphi : A^* \to 2^{X^*}$, which
associates to each letter in $A$ a language in $X^*$. Our purpose is to study graphs,
starting from the complete binary tree ($\Lambda$) labelled in $X = \{A, B\}$. To move by
inverse arcs, we use a new alphabet $\bar{X} = \{\bar{A}, \bar{B}\}$, and we say that $x \xrightarrow{\bar{A}}_\Lambda y$ if
$y \xrightarrow{A}_\Lambda x$. Given a language $L$ and two vertices $x$ and $y$, recall that
$x \overset{L}{\Longrightarrow}_\Lambda y \iff \exists u \in L,\ x \overset{u}{\Longrightarrow}_\Lambda y$. Now, given a substitution
$\varphi : A^* \to 2^{(X \cup \bar{X})^*}$, we can define the graph $\varphi^{-1}(\Lambda)$
in the following way:

$\varphi^{-1}(\Lambda) = \{ x \xrightarrow{d} y \mid d \in A \wedge x \overset{\varphi(d)}{\Longrightarrow}_\Lambda y \}$

Given a language $L$, we now define $L_\Lambda = \{ s \mid r \overset{L}{\Longrightarrow}_\Lambda s \}$. It allows us to consider
the graph $\varphi^{-1}(\Lambda)|_{L_\Lambda}$: it is the image of the complete binary tree by an inverse
substitution followed by a restriction; if $L$ is rational, we say a rational restriction.
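A small sketch may help to see how a path over $X \cup \bar{X}$ moves in the tree. It assumes that each vertex of the binary tree is named by the word read from the root, that barred letters are written in lower case (`a`, `b` standing for $\bar{A}$, $\bar{B}$), and it uses the grid substitution of Example 3.2 ($\bar{B}^m A B^m$ for label $a$, the single word $B$ for label $b$, restriction to $A^*B^*$). The names `walk`, `h_a`, `in_restriction` and `grid_arcs` are hypothetical.

```python
def walk(x, w):
    """Follow the path w from tree vertex x; return None if it leaves the tree.

    Upper-case letters move down to a son, lower-case (barred) letters move
    up, which is only possible when x ends with the matching letter.
    """
    for c in w:
        if c.isupper():
            x = x + c
        elif x.endswith(c.upper()):
            x = x[:-1]
        else:
            return None
    return x


def h_a(m):
    """Word number m of the language associated with label a."""
    return "b" * m + "A" + "B" * m


def in_restriction(x):
    """Membership in A*B* for a word over {A, B}."""
    return "BA" not in x


def grid_arcs(x):
    """Arcs leaving x in the inverse image, restricted to A*B*."""
    arcs = set()
    for m in range(len(x) + 1):          # longer words cannot climb further
        y = walk(x, h_a(m))
        if y is not None and in_restriction(y):
            arcs.add((x, "a", y))
    y = walk(x, "B")                      # label b is associated with {B}
    if in_restriction(y):
        arcs.add((x, "b", y))
    return arcs
```

From the vertex `ABB`, only the word with $m = 2$ lands back inside $A^*B^*$, which gives the single $a$-arc to `AABB`, as expected in a grid.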

Example 3.2. Example 2.7 states that the grid is a rational graph. The following
substitution: $h(a) = \{ \bar{B}^m A B^m \mid m \ge 0 \}$, $h(b) = \{B\}$ over the complete binary
tree on $\{A, B\}$, followed by the restriction to $L = A^*B^*$, produces a graph
isomorphic to the grid:
Now, it is well known that there is a close relationship between linear languages
and rational relations (a linear language is a context-free language generated by
a grammar with at most one non-terminal on the right hand side of each
rule). And indeed, if we denote the set of linear languages over the alphabet
$X \cup \bar{X}$ by $Lin(X \cup \bar{X})$, we have the following proposition.

Proposition 3.3. The set $GRat_A$ is a subset of the family of the graphs obtained
from the complete binary tree ($\Lambda$) by an inverse linear substitution, followed
by a rational restriction:

$GRat_A \subseteq \{ [\varphi^{-1}(\Lambda)|_{L_\Lambda}]_\equiv \mid \forall d \in A,\ \varphi(d) \in Lin(X \cup \bar{X}) \wedge L \in Rat(X^*) \}$

Proof (Sketch). We first transform the transducer generating the graph ($G$) so
that each vertex begins with the same prefix. Then we produce linear languages
($L_d$) such that $(u, d, v) \in X^* \times A \times X^*$ is an arc of $G$ if and only if $\bar{\tilde{u}} v \in L_d$. We
then define $\varphi(d)$ to be $L_d$. It only remains to define $L$ (the rational restriction)
to be $L := Dom(G) \cup Im(G)$. □

The converse of this result would help us to grasp the structure of rational
graphs. Unfortunately it is not obvious. Actually the following example illustrates
the difficulty of the naive converse of Proposition 3.3.

Example 3.4. Consider $\varphi(a) = \{ \bar{B} B A^n B^n \mid n \in \mathbb{N} \}$; it is a linear substitution.
Consider $L = BA^*B^*$ and the graph $G = \varphi^{-1}(\Lambda)|_{L_\Lambda}$. Structurally, graph $G$ is
rational (it is the star). But the graph naturally associated to $G$ (according to
$\varphi(a)$ and $L$) is $G' = \{ (B, a, BA^nB^n) \mid n \in \mathbb{N} \}$, which is not rational.

So there is a deep isomorphism problem to get the converse. Actually, we 
will try to inject rationality in the “linear language” to achieve a complete char- 
acterization of rational graphs. 

A natural way to introduce rationality into $Lin(X \cup \bar{X})$ would be to require
the projections over barred and non-barred letters to be rational. The next
example shows that, again, things are not so nice.

Example 3.5. Consider $\varphi(a) = \{ \bar{A}\bar{B} B A^n B^m \mid n \ge m \} \cup \{ \bar{B} B A^n B^m \mid m > n \}$;
it is a linear substitution. Moreover it has rational projections over barred
and non-barred letters. Consider $L = BA^*B^*$ and the graph $G = \varphi^{-1}(\Lambda)|_{L_\Lambda}$.
Structurally, graph $G$ is rational (it is two stars). But the graph naturally
associated to $G$ (according to $\varphi(a)$ and $L$) is
$G' = \{ (BA, a, BA^nB^m) \mid n \ge m \} \cup \{ (B, a, BA^nB^m) \mid m > n \}$, which is not rational (its intersection with the
recognizable set $\{BA\} \times \{a\} \times BA^*B^*$ is $\{ (BA, a, BA^nB^m) \mid n \ge m \}$, which is
not rational).

Now consider the set $Ratlin(X \cup \bar{X})$ of linear languages (called rational-linear)
over $(X \cup \bar{X})^*$ such that the productions of their grammars are of the following
form: $p \to \bar{u} q v$ (with $\bar{u} \in \bar{X}^*$ and $v \in X^*$) or $p \to \varepsilon$.

Theorem 3.6. The set $GRat_A$ is precisely the set of graphs obtained from the complete
binary tree ($\Lambda$) by an inverse rational-linear substitution, followed by a rational
restriction:

$GRat_A = \{ [\varphi^{-1}(\Lambda)|_{L_\Lambda}]_\equiv \mid \forall d \in A,\ \varphi(d) \in Ratlin(X \cup \bar{X}) \wedge L \in Rat(X^*) \}$

Proof (Sketch). The first inclusion is treated in Proposition 3.3. For the reverse
inclusion we first take a graph $G$, image of a rational-linear substitution followed
by a rational restriction; then we need to check that it is possible to produce a
transducer from the grammars of $\varphi(d)$ for each $d$. Then we show that the graph it generates
contains $G$; finally, using rational intersection, we obtain precisely $G$. □

Now that an external characterization of the rational graphs has been given, 
the next section will consider the properties of the traces of rational graphs. 



4 The Traces of Rational Graphs 

We have already seen that there is a strong connection between language theory 
and rational graphs. In this section we will see another connection between 
graphs and languages, in terms of traces. 

We first recall that the trace of a graph $G$ leading from a vertex set $I$ (of initial
states) to a vertex set $F$ (of final states) is the set of all the path labels in the
graph, leading from a vertex in the set of initial states to a vertex in the set of
final states:

$L(G, I, F) := \{ u \mid \exists s \in I,\ \exists t \in F,\ s \overset{u}{\Longrightarrow}_G t \}$

In other words the trace of a graph is “the language of its labels” . For example 
the traces of the finite graphs are all rational languages and the traces of prefix- 
recognizable graphs are all context-free languages. Notice by the way that the 
traces of rational graphs contain therefore every context free language. 

Proposition 4.1. The trace of a rational graph leading from a rational vertex
set to a context-free vertex set (or vice-versa) is recursive.

Proof. In order to check whether a word $u = u_1 \cdots u_n$ is in the trace of graph $G$ (from a
set $I$ to a set $F$), it suffices to check whether the set $S = \xrightarrow{u_n}_G(\cdots(\xrightarrow{u_1}_G(I))\cdots)$
intersects set $F$. If set $I$ is rational (resp. context-free), its image by a rational
transduction is rational (resp. context-free); hence by a simple induction, set $S$
is rational (resp. context-free). Therefore it is decidable whether $S \cap F$ is empty. □

Let us denote by TR the family of the traces of rational graphs leading from
a rational vertex set to a rational vertex set: $TR = \{ L(G, I, F) \mid G \in Rat(X^* \times A \times X^*) \wedge I, F \in Rat(X^*) \}$
(notice that we could as well restrict ourselves to a
unique initial state and a unique final state). Now we will show that the set TR forms
an Abstract Family of Languages (AFL), that is, it satisfies the following properties:



— closure for intersection with a rational (regular) language, 

— closure under non-erasing (monoid)morphism, and inverse morphism, 

— for each $L, L' \in TR$ we have $L \cdot L'$, $L \cup L'$, $L^+$, $L^* \in TR$.

Proposition 4.2. The intersection of two elements of TR is an element of TR.

Proof. Consider two elements $L$ and $L'$ of TR. Say $L = L(G, I_G, F_G)$ and $L' = L(G', I_{G'}, F_{G'})$.
The language $L \cap L'$ is actually the trace of $G \cdot (\{\$\} \times A \times \{\$\}) \cdot G'$
(with \$ a new symbol) between $I_G \cdot \{\$\} \cdot I_{G'}$ and $F_G \cdot \{\$\} \cdot F_{G'}$. Hence $L \cap L'$ is
an element of TR. □

As rational languages are traces of rational graphs (finite graphs are rational
graphs), family TR is closed under intersection with rational languages.

Now let us recall that a finite (resp. rational) substitution $\sigma : A^* \to 2^{A^*}$ is a
morphism such that for each letter $d$ in $A$, $\sigma(d)$ is a finite (resp. rational) subset
of $A^*$. A substitution is non-erasing if $\varepsilon \notin \sigma(d)$ for all $d \in A$.
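The product used in this proof can be tried out on finite graphs. The sketch below is an illustration under assumptions: graphs are finite sets of triples with single-character labels, traces are only enumerated up to a length bound, and the names `product`, `trace`, `G`, `H` are hypothetical. It pairs arcs carrying the same label and joins vertex names with the new symbol `$`.

```python
def product(G, H):
    """Arcs (s$s', d, t$t') for every (s, d, t) in G and (s', d, t') in H."""
    return {(s + "$" + s2, d, t + "$" + t2)
            for (s, d, t) in G
            for (s2, d2, t2) in H
            if d == d2}


def trace(graph, initial, final, max_len):
    """Labels of the paths of length 1..max_len from initial to final."""
    words = set()
    frontier = {(s, "") for s in initial}
    for _ in range(max_len):
        frontier = {(t, w + d)
                    for (s, w) in frontier
                    for (s2, d, t) in graph
                    if s2 == s}
        words |= {w for (t, w) in frontier if t in final}
    return words


# G traces (ab)^+ from x to x; H traces a*b from p to q.
G = {("x", "a", "y"), ("y", "b", "x")}
H = {("p", "a", "p"), ("p", "b", "q")}

both = trace(product(G, H), {"x$p"}, {"x$q"}, 4)
separately = trace(G, {"x"}, {"x"}, 4) & trace(H, {"p"}, {"q"}, 4)
```

On this instance both computations give the single word `ab`, in line with the statement that the trace of the product is the intersection of the traces.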

Proposition 4.3. Family TR is closed under non-erasing finite substitution. 

Proof (Sketch). Consider $\sigma$ a non-erasing finite substitution, and $L$ a language in
TR. We take a graph $G$ such that $L = L(G, I, F)$, and $T$ a transducer generating
$G$. We then construct a new transducer such that each production $d$ in $T$ is
replaced by a path $u$ (in the corresponding graph), for each $u \in \sigma(d)$. The trace
of the graph generated by this transducer is $\sigma(L)$. □

The following corollary is a direct consequence of this proposition.

Corollary 4.4. Family TR is closed under non-erasing morphism. 

Notice that the condition “non-erasing” is essential for our proof. An interesting
question is whether this condition is necessary.

Proposition 4.5. Assume that $L$ is an element of TR and that $\sigma$ is a finite
substitution over $A^*$; then $\sigma^{-1}(L)$ is a language of TR.

Proof (Sketch). This proposition is a consequence of Elgot and Mezei's theorem,
which states that the composition of two rational relations is a rational relation
(see for example [Be 79], Theorem 4.4, p. 68). Using this we can produce a rational
graph in which a finite number of finite paths are replaced by arcs, which proves
the Proposition. □

Remark: Note that it is not as straightforward for inverse rational substitution.
Actually it seems that it is not true for inverse rational substitution: consider any
rational graph with one label ($a$) and the inverse rational substitution $\sigma(a) = a^*$.
The graph image with the same approach would be the transitive closure of
the original graph, which is not effectively rational (and might not even be
structurally rational) as stated in the remark after Proposition 2.15.

The following corollary is an obvious consequence of Proposition 4.5.

Corollary 4.6. Family TR is closed under inverse morphism. 

Proposition 4.7. Family TR is closed under concatenation, Kleene plus and 
star. 

Proof (Sketch). The argument is more or less the same as for finite automata. 
We use operation over rational relations to get the results. □ 

As stated earlier, we now only have to summarize these results.

Theorem 4.8. The traces of rational graphs, leading from a rational vertex set
to a rational vertex set, form an AFL (Abstract Family of Languages).

Proof. This result is simply a brief summary of Corollaries 4.4 and 4.6 and
Propositions 4.2 and 4.7. □

Now we have an abstract family of languages that contains the context free 
languages. This AFL is a subset of the recursive languages. It seems that this 
family is composed of the context sensitive languages. 

Conjecture 4.9. The traces of the rational graphs are precisely the context-sensitive
languages.

Notice also that recently graphs of linear bounded machines (which characterize
context-sensitive languages) have been studied in [KP 99].

5 Conclusion 

In this paper, a general family of graphs has been introduced. Rational graphs 
are a strict extension of previously studied families. It is a well grounded family, 
related to well known structures of language theory. We have given both an 
internal and an external characterization, as well as some basic properties. 

Unfortunately, or fortunately depending on the point of view, it is a very ex- 
pressive family. Therefore many decision results are lost. An interesting question 
is to study restrictions of this family that will retain decision results from former
families. 

Traces of rational graphs are another aspect of this family. We have shown 
that it forms an abstract family of recursive languages. An interesting question 
is to know if these traces are precisely the context sensitive languages. 

Rational trees also seem to be an interesting field of research, but this has 
not been done yet. 
Abstract. This paper is a formal study of how to implement interaction
nets, filling an important gap in the work on this graphical rewriting
formalism, very promising for the implementation of languages based on
the λ-calculus. We propose the first abstract machine for interaction
net reduction, based on a decomposition of interaction rules into more 
atomic steps, which tackles all the implementation details hidden in the 
graphical presentation. As a natural extension of this, we then give a con- 
current shared-memory abstract machine, and show how to implement 
it, resulting in the first parallel implementation of interaction nets. 



1 Introduction 

Interaction Nets (INs) are an extension of Proof-Nets for the multiplicative frag- 
ment of Linear Logic [^, proposed by Yves Lafont mm as a simple and in- 
herently parallel graphical formalism for programming. By way of a number of 
translations of the λ-calculus, interaction nets have proved to be a useful new
paradigm for implementing functional languages, specifically when controlling 
the sharing of terms is a priority. The research effort has however been directed 
more at these translations than at the implementation of interaction net reduc-
tion; in particular, no parallel implementations exist. In this paper we study the
sequential and concurrent implementation of interaction nets, by means of ab- 
stract machines in which interaction steps are decomposed into simple machine 
operations. 

The interest of interaction nets for functional programming is twofold: on
the one hand, they allow one to control the amount of shared reductions performed. In
particular, they have been used for the implementation of Optimal Reduction 
(as formalized in [T^, and brought to practice in successive studies nolle] m), 
but also other efficient strategies for the λ-calculus have been proposed [13].

On the other hand, we have their potential (which has not yet been fully 
explored) to be implemented in parallel: unlike general graph-rewriting systems, 
interaction nets possess locality and confluence properties which allow for all the
active pairs in a given net to be reduced simultaneously, without interference. 

Implementations of INs have to this point been quite ad-hoc, since the for- 
malism, being a graphical representation, has always been assumed to be trivial, 
and no formal studies of its implementation have been undertaken. However, 
there is a considerable gap between this presentation and a running implemen- 
tation, in the sense that each basic interaction rewriting step may require many 
‘rewiring’ steps, implemented by non-trivial sequences of machine operations. 

Additionally, a decomposition of interaction steps into sequences of basic, 
close-to-machine operations is essential if parallel implementations are to be 
studied. We may then investigate whether the benefits are limited to the poten- 
tial parallelism contained in the nets (simultaneous redexes) or if there are other 
opportunities for parallelizing, at the level of small-grain machine operations. 

Abstract machines for the λ-calculus such as the SECD machine m or Kriv-
ine’s machine [3] have been proposed as implementation devices encompassing
(and decomposing) both the β-reduction relation and the variable substitution
mechanism. A mechanism along these lines does not however exist for interaction 
nets. In this paper we introduce such an abstract machine, providing a suitable 
decomposition of interaction rewriting steps into fine grain operations. 

From this machine we then obtain a concurrent (multi-threaded, shared mem- 
ory) version as a simple generalization, where basic machine tasks are distributed 
among threads. This may be implemented on any platform offering support for 
multi-threaded computation, although some care is required to guarantee cor- 
rectness. The result is the first parallel reducer for interaction nets. 

Structure of the Paper. We start by briefly reviewing interaction nets and some
details of their implementation. In Sect. 3 we introduce notation and then define
machine configurations as appropriate tuples of data-structures, and show how
to obtain a configuration from a net, before giving the definition of the sequential
abstract machine. Section 4 is devoted to the study of correctness of this machine.
We then present the concurrent machine in Sect. 5. In Sect. 6 we mention some
implementation aspects, notably with respect to the parallel implementation of
the concurrent abstract machine, and finally conclude in Sect. 7.

2 Background and Motivation 

An interaction net is an undirected graph built from a set of cells or agents, each 
of which contains a principal port, and a number (possibly zero) of auxiliary 
ports. Edges in this graph connect any two ports, but no more than one edge 
may be connected to the same port. A free port in the net has a hanging edge 
connected to it, i.e., an edge which is not connected to anything at the other
extremity. The observable interface of a net is the set of its free ports arranged 
in a sequence. Computation (in the form of graph-rewriting) takes place only 
at special edges of the net, those connecting two cells by their principal ports. 
Such a pair of cells is called an active pair, and it may be rewritten using an 
appropriate interaction rule. An interaction system specifies a set of agents (from
which to build nets) and rules (with which to rewrite them).

Fig. 1. Example interaction rule and its representation using Lafont’s notation

Fig. 2. Example interaction net

An Example. The left-hand side of Fig. 1 represents an interaction rule in a
system containing agents (0, S, and +) and rules for natural numbers arithmetic.
The cells are the following: 0 is a single-port agent, representing the constant 0.
S is a constructor with two ports, the principal port representing the successor
of the number in the auxiliary port. Finally, the + agent has two auxiliary ports,
one of which is for the sum of the numbers in the principal port and in the other
auxiliary port. We define addition inductively on its first argument, so this must
be associated to a principal port, where interaction is possible.

The rule is to be applied as a graph-rewriting rule: in a net in which a sub- 
net matching the left side of the rule occurs, this sub-net may be substituted 
by the net on the right. The interface preservation property ensures that there 
is a wire in the right-hand side net to connect to every wire left hanging when 
the sub-net is removed from the initial net. Figure 2 is an example of how this
rule can be applied. We distinguish observable values (a and b) graphically using
small squares. In the figure, the two active pairs are reduced in parallel, using
the rule in Fig. 1. The reader will have no difficulty in writing the interaction rule
(between the 0 and + agents) that would be needed for reduction to proceed.

The motivations for parallelism can be easily understood. Interaction always 
happens locally, when two cells are connected via their principal ports. The 
rewritable elements are always pairs of cells, so any cell can be involved in at 
most one such element; no critical pairs exist. Any two active pairs in a net can 
be rewritten in arbitrary order; strong local confluence holds, resulting in the 
fact that the sequences of rewriting steps used in the normalization of a net are
all permutations of the same set of individual rewrites. 

Implementation Issues. This very simple example raises important questions: 

1. What data-structures are used to represent interaction nets, and how can 
the graphically trivial notion of rewiring be formalized and implemented? 

2. How is the observable interface updated (this is a particular case of rewiring)? 

3. Each application of an interaction rule involves producing a copy of its right- 
hand side. How is this accounted for? 

4. How are active pairs identified? For instance, in the right-hand side of Fig. 2,
0 and + are connected by their principal ports, but + and S, even though
they are connected, do not form a new active pair.

5. What are the space and time resources required by each operation? 

6. If the implementation is parallel, what model of concurrency is used? 

7. What are the control mechanisms used for granting correct access to shared 
resources by the different parallel processing elements? 

An abstract machine, by decomposing interaction into atomic operations, should 
provide answers to these questions, and should be directly implementable. 

A Language for Interaction Nets. The language we use to describe nets was
originally given by Lafont [8] and developed in [4]. Agents are written as algebraic
constructors with arity equal to the number of auxiliary ports. Active pairs are 
then equations, equalities between terms with variables. A wire linking two leaves 
(two auxiliary ports) in two such terms (trees) is represented by two occurrences 
of the same variable. Variables are allowed as members in equations, to allow for 
modular descriptions. Each variable occurs exactly twice in the net. 

A net may then be described as a pair $\langle \vec{t} \mid \Delta \rangle$, with $\vec{t}$ a sequence of terms
(its observable interface) and $\Delta$ a multiset of equations. With respect to rules,
each one may be represented succinctly as a net with one active pair (and empty
observable interface), by wiring together each free port occurring in the left-hand
side of the rule and the corresponding port in its right-hand side (see Fig. 1).
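
As an illustration of this textual language, the following minimal Python sketch represents terms and nets and checks the condition that each variable occurs exactly twice. The class and function names (`Var`, `Agent`, `variables`, `is_well_formed`) are hypothetical and chosen only for this example; they are not part of the paper.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Agent:
    symbol: str
    args: tuple = ()  # one sub-term per auxiliary port


Term = Union[Var, Agent]


def variables(term):
    """Return the variable names of a term, left to right, with repeats."""
    if isinstance(term, Var):
        return [term.name]
    names = []
    for arg in term.args:
        names.extend(variables(arg))
    return names


def is_well_formed(interface, equations):
    """Check that every variable occurs exactly twice in the whole net."""
    counts = Counter()
    for term in interface:
        counts.update(variables(term))
    for lhs, rhs in equations:
        counts.update(variables(lhs))
        counts.update(variables(rhs))
    return all(n == 2 for n in counts.values())


# A net with interface a, b and the single active pair S(0) = +(a, b).
net_interface = [Var("a"), Var("b")]
net_equations = [(Agent("S", (Agent("0"),)), Agent("+", (Var("a"), Var("b"))))]

# The rule for (S, +) written as a net with an empty interface:
# S(+(x, y)) = +(x, S(y)).
rule_equations = [
    (Agent("S", (Agent("+", (Var("x"), Var("y"))),)),
     Agent("+", (Var("x"), Agent("S", (Var("y"),))))),
]
```

Both the example net and the rule net satisfy the check, since every wire between two auxiliary ports (or between the interface and a port) contributes two occurrences of one variable.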

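Semantics. We briefly review a calculus for interaction nets [1], inspired by the
Chemical Abstract Machine [2]. Let $\equiv$ be the smallest equivalence satisfying the
structural rules $\Delta, t = u, \Theta \equiv \Delta, u = t, \Theta$ and
$\Delta, t = u, v = w, \Theta \equiv \Delta, v = w, t = u, \Theta$.
$\mathcal{N}$ is the function giving the set of variables occurring in a term.
We give a set of conditional reduction rules for interaction nets. We assume no
variable occurs simultaneously in a rule and in the net.

Interaction: $(\alpha(t'_1, \ldots, t'_n), \beta(u'_1, \ldots, u'_m))$ is an interaction rule $\Rightarrow$
$\langle \vec{t} \mid \alpha(t_1, \ldots, t_n) = \beta(u_1, \ldots, u_m), \Gamma \rangle \to \langle \vec{t} \mid t_1 = t'_1, \ldots, t_n = t'_n, u_1 = u'_1, \ldots, u_m = u'_m, \Gamma \rangle$.

Indirection: $x \in \mathcal{N}(u) \Rightarrow \langle \vec{t} \mid x = t, u = v, \Gamma \rangle \to \langle \vec{t} \mid u[t/x] = v, \Gamma \rangle$.

Collect: $x \in \mathcal{N}(\vec{t}) \Rightarrow \langle \vec{t} \mid x = u, \Delta \rangle \to \langle \vec{t}[u/x] \mid \Delta \rangle$.

Multiset: $\Theta \equiv \Theta'$, $\langle \vec{t}_1 \mid \Theta' \rangle \to \langle \vec{t}_2 \mid \Delta' \rangle$, $\Delta' \equiv \Delta \Rightarrow \langle \vec{t}_1 \mid \Theta \rangle \to \langle \vec{t}_2 \mid \Delta \rangle$.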
Properties. Some properties of this system, proved in [1], are strong local con-
fluence (diamond property) and uniqueness of normal forms. A standard result
of (abstract) rewrite systems also yields, as a consequence of strong confluence:

Lemma 1. Any normalizing interaction net is strongly normalizing. 

Remarks. Each variable occurs exactly once in the term (or list) where it is 
substituted, and no two rules may perform substitution of the same variable. 

The above semantics defines a notion of canonical forms for interaction nets:
$P \Downarrow Q$ iff $P \to^* Q$ and $Q \not\to$. These canonical forms correspond to irreducible
nets $\langle \vec{t} \mid \varepsilon \rangle$ or $\langle \vec{t} \mid LC \rangle$, with $LC$ a list of cycles of the form $x = t$, with $x \in \mathcal{N}(t)$.

Structural Equivalence of Interaction Nets. Two interaction nets are α-converti-
ble, written $A =_\alpha B$, if they are the same up to renaming of variables. We define
structural equivalence ($\equiv$) as satisfying $A \equiv \langle \vec{t} \mid \Delta \rangle$ whenever $A =_\alpha \langle \vec{t} \mid \Theta \rangle$ and
$\Theta \equiv \Delta$. This is clearly an equivalence relation that is preserved by reduction.

3 A Sequential Abstract Machine for Interaction Nets 

Interaction Systems. An interaction system is a tuple $(\Sigma, \mathcal{R}, \mathcal{V})$, where $\Sigma$ is a
set of agents and $\mathcal{R}$ is a set of interaction rules. Greek letters $\alpha, \beta, \ldots$ range
over agents. $n_\alpha$ is the arity of agent $\alpha$. $\mathcal{V}$ is the set of variables in the system,
ranged over by $x, y, \ldots$, and which allows one to define the set Terms for this
system, which are either variables or agent terms of the form $\alpha(t_1, \ldots, t_{n_\alpha})$, with
$t_1, \ldots, t_{n_\alpha} \in$ Terms. We will sometimes write this as $\alpha(\vec{t})$.

The function $\mathcal{N} : \text{Terms} \to \mathcal{P}(\mathcal{V})$ returns the set of variables in a term (it
can be trivially extended to pairs, sequences, and sequences of pairs of terms).
The $\hat{\ }$ operator is applied to a rule to produce a copy of it in which all variable
names are fresh (w.r.t. a certain context - a machine configuration) and unique.

We then define $T$ as the subset of Terms in which each variable occurs at
most once. This linearity condition may be generalized to lists of terms and lists
of pairs of terms, allowing substitution to be defined trivially as assignment, since
no erasing or copying of the substituted term will happen: $t[u/x] = t[x := u]$.

Each rule $r \in \mathcal{R}$ in the system is a tuple $r = (t_1, t_2, \phi_r)$ where $t_1, t_2 \in T$ but
$t_1, t_2 \notin \mathcal{V}$, $\mathcal{N}(t_1) \cap \mathcal{N}(t_2) = \emptyset$, and $\phi_r : \mathcal{N}(t_1) \cup \mathcal{N}(t_2) \to \mathcal{N}(t_1) \cup \mathcal{N}(t_2)$ is a
fixpoint-free involutive permutation (or involution) on variables, i.e., $y = \phi_r(x)$
implies $x = \phi_r(y)$, and $\phi_r(x) \neq x$. We shall denote by $\phi_r[x \leftrightarrow y]$ the permutation
which maps $x$ to $y$, $y$ to $x$, and any other variable $z$ to $\phi_r(z)$. We require that
no two interaction rules exist in $\mathcal{R}$ for the same pair of agents, and also that $\mathcal{R}$
is closed under symmetry so that the order of $t_1$ and $t_2$ is irrelevant. To write
rules as given before in this framework, it suffices to give different names to the
two occurrences of each variable, and store the linking information in $\phi_r$.
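
The conditions on the linking information can be checked mechanically. The sketch below, in Python, stores an involution as a dictionary on variable names; `is_involution` and `link` are hypothetical helper names introduced for this example, and `link` assumes that neither variable is already paired with a third one.

```python
def is_involution(phi):
    """True if phi (a dict on variable names) is a fixpoint-free involution."""
    return all(phi[x] != x and phi.get(phi[x]) == x for x in phi)


def link(phi, x, y):
    """Return a copy of phi in which x and y are paired with each other.

    Assumption for this sketch: x and y are not already paired with
    some other variable in phi.
    """
    new_phi = dict(phi)
    new_phi[x] = y
    new_phi[y] = x
    return new_phi


# Linking information for a rule whose variables were split into
# x1/x2 and y1/y2, as described in the text.
phi_r1 = link(link({}, "x1", "x2"), "y1", "y2")
```

With this encoding, the two occurrences of a variable in the original rule become two distinct names, and the dictionary records which names belong to the same wire.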

We finally define a set $T_a$ of annotated terms. These are either variables, or
terms of the form $\{X\}.t$ with $t \in T$, and $X$ a list of variables, possibly containing
the symbol □. We shall see in Sect. 3 how these annotations will be used.
Configurations. A machine configuration is a tuple $(\Gamma \mid S \mid \phi_N \mid V \mid C)$, where
no variable occurs repeatedly in two different components of the tuple:

- $\Gamma : \mathcal{V} \to T_a$ is a Heap, a mapping such that $x \in dom(\Gamma)$ implies
  $x \notin \bigcup \{\mathcal{N}(\Gamma(y)) \mid y \in dom(\Gamma)\}$, and $x, y \in dom(\Gamma)$ implies
  $\mathcal{N}(\Gamma(x)) \cap \mathcal{N}(\Gamma(y)) = \emptyset$. We use the notation $\Gamma[x \mapsto \{N\}.t]$ to refer to the
  heap which maps $x$ to $\{N\}.t$ and any other variable $y$ to $\Gamma(y)$;

- $S \in (T_a \times T_a)^*$ is a sequence of Pairs of terms, representing equations;

- $\phi_N : N \to N$ is a fixpoint-free Involution on variables, with
  $N = \mathcal{N}(S) \cup \mathcal{N}(V) \cup \mathcal{N}(C) \cup dom(\Gamma) \cup \bigcup \{\mathcal{N}(\Gamma(x)) \mid x \in dom(\Gamma)\}$;

- $V \in (T_a \cup \{\square\})^*$ is the Observable Interface of the net (a sequence of terms);

- $C \in ((\mathcal{V} \times T_a) \cup \{\square\})^*$ is a sequence of Cycles.

Interaction Functions. We will use the usual $[\ldots]$ notation for lists, $\varepsilon$ for the
empty list, $:$ for cons, $@$ for append. For an interaction system $S$, we define:

$I_S(\alpha(\vec{t}), \beta(\vec{u})) = [(\nu(t_1), \nu(tt_1)), \ldots, (\nu(t_{n_\alpha}), \nu(tt_{n_\alpha})),$
$\qquad (\nu(u_1), \nu(uu_1)), \ldots, (\nu(u_{n_\beta}), \nu(uu_{n_\beta}))]$;
$\Phi_S(\alpha(\vec{t}), \beta(\vec{u})) = \phi_{\hat{r}}$,

where $r = (\alpha(\vec{t'}), \beta(\vec{u'}), \phi_r) \in \mathcal{R}$, $\hat{r} = (\alpha(tt_1, \ldots, tt_{n_\alpha}), \beta(uu_1, \ldots, uu_{n_\beta}), \phi_{\hat{r}})$ is a
fresh copy of $r$, and $\nu$ annotates (agent) terms with a sequence of the variables
occurring in them: $\nu(x) = x$, $\nu(\alpha(\vec{t})) = \{an(\alpha(\vec{t})) : \square : \varepsilon\}.\alpha(\vec{t})$, with $an$ defined
by $an(x) = [x]$ | $an(\alpha(t_1, \ldots, t_{n_\alpha})) = an(t_1) @ \cdots @ an(t_{n_\alpha})$. We will denote by
$(\cdot)^\circ$ an auxiliary function on lists for removing the first occurrence of the □ mark.
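
The annotation functions are simple list computations. The following Python sketch mirrors $an$, $\nu$ and the mark-removal function under an assumed encoding: variables are strings, agent terms are `(symbol, args)` tuples, and an annotated term is a pair `(annotation_list, term)`. The names `annotate` and `strip_mark` are hypothetical.

```python
BOX = "□"  # the end-of-list mark


def an(term):
    """List the variables of a term, left to right."""
    if isinstance(term, str):
        return [term]
    _symbol, args = term
    out = []
    for arg in args:
        out += an(arg)
    return out


def annotate(term):
    """nu: variables are unchanged; agent terms get their variable list plus the mark."""
    if isinstance(term, str):
        return term
    return (an(term) + [BOX], term)


def strip_mark(annotation):
    """Remove the first occurrence of the mark from an annotation list."""
    out = list(annotation)
    if BOX in out:
        out.remove(BOX)
    return out
```

For instance, `annotate(("+", ["x1", "y1"]))` pairs the term with the list `["x1", "y1", "□"]`, so the mark sits at the end of a freshly built annotation.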

Loading the Abstract Machine. In order to obtain initial configurations $\Sigma[\vec{t} \mid \Delta]$
corresponding to an interaction net $\langle \vec{t} \mid \Delta \rangle$ we first need to obtain a net $\langle \vec{t'} \mid \Delta' \rangle$
by splitting variables and linking split pairs in an involution $\phi_N$. Then $\Sigma[\vec{t} \mid \Delta]$
is any configuration $(\emptyset \mid \nu_{lp}(seq(\Delta')) \mid \phi_N \mid \nu_l(\vec{t'}) : \square : \varepsilon \mid \square : \varepsilon)$, where $seq(\Delta')$
is a list obtained by arbitrarily ordering the set $\Delta'$, and $\nu_l, \nu_{lp}$ generalize the
previously defined $\nu$ for lists of terms and lists of pairs of terms, respectively.

The basic idea. We describe the machine succinctly: equation pairs are stored in 
the list S. Each execution step pops a pair, and an appropriate rule is selected 
(by pattern-matching) to handle that pair. 

If it is an active pair $(\alpha(\vec{t}), \beta(\vec{u}))$, an interaction will be performed, invoking
the interaction functions and pushing the newly generated pairs $I_S(\alpha(\vec{t}), \beta(\vec{u}))$
onto the stack, and including in the involution the pairs given by $\Phi_S(\alpha(\vec{t}), \beta(\vec{u}))$.

If it is a pair of variables $(x, y)$ such that $\phi_N(x) = z$ and $\phi_N(y) = w$, we
remove from $\phi_N$ the pairs $x \leftrightarrow z$ and $y \leftrightarrow w$, and add to it the pair $z \leftrightarrow w$.

For a pair $(x, \alpha(\vec{t}))$, we simply create a new entry $x \mapsto \alpha(\vec{t})$ in the heap.

The machine stops when S is empty. For now we consider that the sequence 
of pairs is accessed using a LIFO strategy, thus as a stack. However it is not 
important how we implement the multiset of equations as a linear data-structure; 
by opting for a stack we simply increase the locality of the machine. 
Fig. 3. Example of the rewiring involved in the application of an interaction rule



Rewiring is handled by the two last cases. Using proof-net terminology, a 
cut between two axioms gives origin to a single axiom, whereas a cut between 
an axiom and any other net results in this last net being stored in the heap. 
For this reason, final configurations are totally contained in the heap, since the 
information progressively stored in it has not been used anywhere. We thus 
need a set of post-processing rules that will traverse the observable interface and
update every variable $x_i$ in it with the value stored in the heap for $\phi_N(x_i)$. We
shall not worry about post-processing here.

An Example. Let us see how this machine deals with a simplified version of
the example in Sect. 2. First observe, on the right in Fig. 1, the interaction
rule for the pair (+, S) redrawn to reflect our notation for rules, giving
$r_1 = (S(+(x_1, y_1)), +(x_2, S(y_2)), \{x_1 \leftrightarrow x_2, y_1 \leftrightarrow y_2\})$.

We show in Fig. 3 a net consisting of a single active pair, together with a
copy of the rule mentioned. The rewiring that needs to be done is shown in the
picture, and consists in wiring together all the corresponding arguments of each
agent in the active pair in the net and in the rule (and removing the active pair
from both). Formally, we start with an initial configuration

$\Sigma_0 = (\emptyset \mid [(S(0), +(a_2, b_2))] \mid \{a_1 \leftrightarrow a_2, b_1 \leftrightarrow b_2\} \mid [a_1, b_1, \square] \mid [\square])$.

The first step of the abstract machine produces a new configuration by popping
the pair from the stack and invoking the interaction functions:

$I_S(S(0), +(a_2, b_2)) = [(b_2, S(y_2)), (a_2, x_2), (0, +(x_1, y_1))]$,

$\Phi_S(S(0), +(a_2, b_2)) = \{x_1 \leftrightarrow x_2, y_1 \leftrightarrow y_2\}$,

resulting in the configuration

$\Sigma_1 = (\emptyset \mid [(b_2, S(y_2)), (a_2, x_2), (0, +(x_1, y_1))]$
$\qquad \mid \{a_1 \leftrightarrow a_2, b_1 \leftrightarrow b_2, x_1 \leftrightarrow x_2, y_1 \leftrightarrow y_2\} \mid [a_1, b_1, \square])$.

The pair at the top is now made of a variable and an agent term, which will now
be associated to the variable, in the heap.

$\Sigma_2 = (\{b_2 \mapsto S(y_2)\} \mid [(a_2, x_2), (0, +(x_1, y_1))]$
$\qquad \mid \{a_1 \leftrightarrow a_2, b_1 \leftrightarrow b_2, x_1 \leftrightarrow x_2, y_1 \leftrightarrow y_2\} \mid [a_1, b_1, \square])$.



274 Jorge Sousa Pinto 



The next pair to be popped has two variables, so we update the involution:

$\Sigma_3 = (\{b_2 \mapsto S(y_2)\} \mid [(0, +(x_1, y_1))] \mid \{a_1 \leftrightarrow x_1, b_1 \leftrightarrow b_2, y_1 \leftrightarrow y_2\} \mid [a_1, b_1, \square])$.

We shall not proceed with the reduction. Instead we show the result of updating
the Observable Interface of $\Sigma_3$ with the values stored in the heap:

$\Sigma'_3 = (\emptyset \mid [(0, +(x_1, y_1))] \mid \{a_1 \leftrightarrow x_1, y_1 \leftrightarrow y_2\} \mid [\square, a_1, S(y_2)])$.
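
The three steps of this example can be replayed with a small Python sketch of the simple machine (no annotations, no cycle handling). It is only an illustration under stated assumptions: variables are strings, agent terms are `(symbol, args)` tuples, the top of the stack is the end of a Python list, and the single rule in the hypothetical `RULES` table is used without renaming, which is safe here only because it is applied once.

```python
def is_var(term):
    return isinstance(term, str)


# Rule r1 for the pair (S, +), in an assumed encoding: the arguments of S
# in the rule, the arguments of + in the rule, and the linking involution.
RULES = {
    ("S", "+"): (
        [("+", ["x1", "y1"])],
        ["x2", ("S", ["y2"])],
        {"x1": "x2", "x2": "x1", "y1": "y2", "y2": "y1"},
    ),
}


def step(heap, stack, phi):
    """Pop one pair and process it, updating heap, stack and phi in place."""
    left, right = stack.pop()
    if not is_var(left) and not is_var(right):
        # Active pair: wire each argument to the matching argument in the rule.
        (a, targs), (b, uargs) = left, right
        rule_targs, rule_uargs, rule_phi = RULES[(a, b)]
        for pair in zip(targs + uargs, rule_targs + rule_uargs):
            stack.append(pair)
        phi.update(rule_phi)
    elif is_var(left) and is_var(right):
        # Two variables: replace x <-> z and y <-> w by z <-> w.
        z, w = phi.pop(left), phi.pop(right)
        phi[z], phi[w] = w, z
    else:
        # Variable and agent term: store the term in the heap.
        var, term = (left, right) if is_var(left) else (right, left)
        heap[var] = term


heap = {}
stack = [(("S", [("0", [])]), ("+", ["a2", "b2"]))]
phi = {"a1": "a2", "a2": "a1", "b1": "b2", "b2": "b1"}
for _ in range(3):
    step(heap, stack, phi)
```

After the three steps the heap holds `b2 ↦ S(y2)`, the stack holds the remaining pair `(0, +(x1, y1))`, and the involution links `a1` with `x1`, matching the configuration reached in the text before the interface is updated.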

A Machine with Substitutions. The above machine has some drawbacks that we 
will eliminate by refining its definition. The first problem has to do with cycles: 
in fact, not every pair $(x, \alpha(\vec{t}))$ should be moved to the heap, since it may
happen that $\phi_N(x) \in \mathcal{N}(\vec{t})$. This is an example of a vicious circle, which may be
generated during reduction. Our machine would store $x \mapsto \alpha(\ldots, \phi_N(x), \ldots)$ in
the heap. Now since $\phi_N(x)$ occurs only in the term associated with $x$, its value
may never be substituted in the interface, thus this part of the net will be forever
lost in the heap. Configurations where this happens are called degenerate.

We must then include a special structure ($C$) in configurations, to store these
pairs, and the machine must specify how the test $\phi_N(x) \in \mathcal{N}(\vec{t})$ is performed.

A different kind of degenerate configuration exists, in which active pairs,
rather than cycles, are irrecoverably contained in the heap. Suppose for instance
the stack looks again like $(x, \alpha(\vec{t})) : (y, \beta(\vec{u})) : \cdots$, but we now have $\phi_N(x) = y$.
After a first step of operation we get a heap $\Gamma[x \mapsto \alpha(\vec{t})]$ and stack $(y, \beta(\vec{u})) : \cdots$.
A second machine step produces a heap $\Gamma[x \mapsto \alpha(\vec{t}), y \mapsto \beta(\vec{u})]$, and the active
pair is lost forever in it, since no substitution of $x$ or $y$ can be made.

It is easy to show other examples where degenerate configurations are cre- 
ated, either with lost cycles or lost active pairs. We will solve this problem by 
performing substitution of variables stored in the heap during the operation of 
the machine. For every non-active pair at the top of the stack, we will exhaus- 
tively substitute variables before actually popping it. 

Looking again at the previous example, instead of performing the second step
given above, we would substitute the variable $y$ with the term $\Gamma[\phi_N(y)]$, to get
the stack $(\alpha(\vec{t}), \beta(\vec{u})) : \cdots$, where the active pair has been recovered.

Implementing Substitutions on the Top of the Stack. In order to allow for the 
exhaustive substitution of variables on the top of the stack to be handled effi- 
ciently, we add to each term an annotation list containing all the variables in it. 
These are included in the configurations when the machine is loaded, and kept 
up-to-date (at a low cost) by every machine rule. Instead of traversing a whole 
term (a tree) looking for variables with values in the heap, we simply traverse 
circularly this annotation structure. The □ mark is initially placed at the end of 
every such annotation list so it can be detected when it has been fully traversed. 

The Abstract Machine. We will now reformulate our configurations by including
explicitly a processing element or thread. Configurations will be of the form
$(\Gamma \mid S \mid \phi_N \mid V \mid C \mid t)$, with $t$ a Thread, built from the following signature:

process : $T_a \times T_a \to$ Thread        enlist : $(T_a \times T_a)^* \to$ Thread
delist : Thread                             cycle : $\mathcal{V} \times T_a \to$ Thread

Each operator corresponds to a state of the machine, in which it performs one
specific action. When it is process($t, u$), it processes the pair $(t, u)$ using the rules
previously outlined; when it is delist it will take a pair from the Pairs stack to
be processed; when it is enlist($l$) it will add to the stack all the pairs in $l$; finally,
when it is cycle($x, t$) it will add the cycle $(x, t)$ to the Cycles structure. The
abstract machine is loaded with $t$ = delist, and stops with $S = \varepsilon$ and $t$ = delist.

We give in Table 1 the abstract machine rules, where the families of rules I
to III correspond to pair-processing as sketched before (including variable sub- 
stitution), and Family IV manages the life-cycle of the thread. 

As an example of the flexibility of this presentation of the machine, consider
what needs to be changed if we want to access the Pairs structure as a Queue:
rule T.2 simply has to give instead $S @ [(t, u)]$ in its right-hand side.

A final remark: observe that all the machine rules perform simple tasks such 
as assignments and rotating lists one position, except for rule I, which depends 
of course on the size of the right-hand side of the interaction rule applied. 

Properties. It is immediate to verify the determinism of this set of rules. Most 
of the proofs of properties stated here are left for the long version of this paper. 

Definition 1 (Correct and Complete Annotations). We say a configura-
tion has correct annotations if for every annotated term $\{A\}.\alpha(\vec{t})$ occurring in
it we have $set(A^\circ) \subseteq \mathcal{N}(\vec{t})$. It has complete annotations if $set(A^\circ) \supseteq \mathcal{N}(\vec{t})$.

Proposition 1. Let $\langle \vec{t} \mid \Delta \rangle$ be an interaction net. If $\Sigma[\vec{t} \mid \Delta] \to^* \Sigma'$, then $\Sigma'$
has correct and complete annotations.

Proof. Straightforward induction on the reductions. $\Sigma[\vec{t} \mid \Delta]$ has correct and
complete annotations, and every machine rule preserves the property. □

Definition 2 (Degeneracy). A configuration is degenerate if the heap con-
tains an active pair $\{x \mapsto \alpha(\vec{t}), y \mapsto \beta(\vec{u})\}$ with $\phi_N(x) = y$, or a cycle
$\{x_i \mapsto t_i\}$, $i = 1 \ldots N$, with $\phi_N(x_i) \in \mathcal{N}(t_{i+1})$ for $i = 1 \ldots N - 1$, and $\phi_N(x_N) \in \mathcal{N}(t_1)$.

Proposition 2. If $\Sigma[\vec{t} \mid \Delta] \to^* \Sigma'$ and $\langle \vec{t} \mid \Delta \rangle$ is an IN, then $\Sigma'$ is not degenerate.
We remark that if $\Sigma \to \Sigma'$, $\Sigma'$ may be degenerate even if $\Sigma$ is not.



4 Correctness of the Sequential Abstract Machine 

Our first goal will now be to define the interpretation of a machine configuration 
as an interaction net. We will then introduce some notation and give several 
lemmas needed for the correctness proof. 
Table 1. Abstract machine rules

Rule  | Component | Before | After
I     | Permut    | $\phi_N$ | $\phi_N \cup \Phi_S(\alpha(\vec{t}), \beta(\vec{u}))$
      | Thread    | process($\{l_1\}.\alpha(\vec{t}), \{l_2\}.\beta(\vec{u})$) | enlist($I_S(\alpha(\vec{t}), \beta(\vec{u}))$)
II.1  | Permut    | $\phi_N[x \leftrightarrow y]$ | $\phi_N[x \leftrightarrow y]$
      | Thread    | process($x, y$) | …
II.2  | Heap      | … | $\Gamma$
      | Permut    | $\phi_N[x \leftrightarrow z]$ | $\phi_N$
      | Thread    | process($x, y$) | …
II.3  | Heap      | $\Gamma[z \perp, w \mapsto \{Y\}.\beta(\vec{u})]$ | $\Gamma$
      | Permut    | $\phi_N[x \leftrightarrow z, y \leftrightarrow w]$ | $\phi_N[x \leftrightarrow z]$
      | Thread    | process($x, y$) | process($x, \{Y\}.\beta(\vec{u})$)
II.4  | Heap      | $\Gamma[z \perp, w \perp]$ | $\Gamma$
      | Permut    | $\phi_N[x \leftrightarrow z, y \leftrightarrow w]$, with $y \neq z$, $x \neq w$ | $\phi_N[z \leftrightarrow w]$
      | Thread    | process($x, y$) | delist
III.0 | Thread    | process($\{Y\}.\alpha(\vec{t}), x$) | …
III.… | Permut    | $\phi_N[x \leftrightarrow y]$ | …
      | Thread    | process($x, \{y : Y\}.\alpha(\vec{t})$) | cycle($x, \{y : Y\}.\alpha(\vec{t})$)
III.… | Heap      | … | $\Gamma$
      | Permut    | $\phi_N[x \leftrightarrow z]$ | $\phi_N$
      | Thread    | process($z, \{\square : Y\}.\alpha(\vec{t})$) | process($\{X\}.\beta(\vec{u}), \{\square : Y\}.\alpha(\vec{t})$)
III.5 | Heap      | … | $\Gamma[z \mapsto \{Y @ [\square]\}.\alpha(\vec{t})]$
      | Permut    | $\phi_N[x \leftrightarrow z]$ | $\phi_N[x \leftrightarrow z]$
      | Thread    | process($z, \{\square : Y\}.\alpha(\vec{t})$) | delist
T.1   | Pairs     | $(t, u) : S$ | $S$
      | Thread    | delist | process($t, u$)
T.2   | Pairs     | $S$ | $(t, u) : S$
      | Thread    | enlist($(t, u) : l$) | enlist($l$)
T.3   | Thread    | enlist($\varepsilon$) | delist
T.4   | Cycles    | $C$ | $(t, u) : C$
      | Thread    | cycle($t, u$) | delist

Definition 3 (Updating). Let $\Sigma = (\Gamma \mid S \mid \phi_N \mid V \mid C \mid t)$. The update of
$\Sigma$ is the configuration $U[\Sigma]$ that results from recursively substituting (updating
annotations) every variable $x$ occurring in any component $S$, $V$, $C$, $t$ of $\Sigma$ with
the value stored for the variable $\phi_N(x)$ in the heap $\Gamma$, removing this entry from
the heap and the pair $x \leftrightarrow \phi_N(x)$ from the permutation.

It is straightforward to expand the post-processing rules with others for up- 
dating the stack and the thread. Together they implement the above notion of 
updating. 

Proposition 3. If $\Sigma$ is non-degenerate then $U[\Sigma]$ has an empty heap.

Definition 4 (Collapsing). From an interaction net $N = \langle \vec{t} \mid \Delta \rangle$ and an
involution $\phi$ we obtain the collapse of $N$ by $\phi$, a net denoted by $Clp[N, \phi]$, by
substituting in $\vec{t}$ and $\Delta$ every pair of variables $x, y$ such that $\phi(x) = y$ with a
fresh variable, which we will by convention call $k_{xy}$.

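Auxiliary Functions. We need a family of auxiliary functions defined as follows:
$\lfloor \cdot \rfloor_0$ takes a term and removes its annotation, $\lfloor x \rfloor_0 = x$ | $\lfloor \{N\}.\alpha(\vec{t}) \rfloor_0 = \alpha(\vec{t})$;
$\lfloor \cdot \rfloor_1$ takes a list of terms and removes the □ mark as well as all the annotations,
$\lfloor \varepsilon \rfloor_1 = \varepsilon$ | $\lfloor \square : t \rfloor_1 = \lfloor t \rfloor_1$ | $\lfloor h : t \rfloor_1 = \lfloor h \rfloor_0 : \lfloor t \rfloor_1$; $\lfloor \cdot \rfloor_2$ takes a list of pairs of
terms and returns the corresponding multiset of equations, $\lfloor \varepsilon \rfloor_2 = \emptyset$ | $\lfloor \square : t \rfloor_2 =
\lfloor t \rfloor_2$ | $\lfloor (h_1, h_2) : t \rfloor_2 = \{\lfloor h_1 \rfloor_0 = \lfloor h_2 \rfloor_0\} \cup \lfloor t \rfloor_2$. Finally, $\lfloor \cdot \rfloor_t$ takes a thread and
returns a set containing the pair being processed by the thread, or the empty
set if the thread does not contain any pair, $\lfloor \text{process}(t_1, t_2) \rfloor_t = \{\lfloor t_1 \rfloor_0 = \lfloor t_2 \rfloor_0\}$ |
$\lfloor \text{delist} \rfloor_t = \emptyset$ | $\lfloor \text{enlist}(l) \rfloor_t = \lfloor l \rfloor_2$ | $\lfloor \text{cycle}(x, t) \rfloor_t = \{x = \lfloor t \rfloor_0\}$. We shall use the
notation $\lfloor \cdot \rfloor$ for any of $\lfloor \cdot \rfloor_0$, $\lfloor \cdot \rfloor_1$, $\lfloor \cdot \rfloor_2$, $\lfloor \cdot \rfloor_t$, whenever the distinction is clear from
context. We also need an auxiliary function $rn$ to rotate a list until □ is at the
end, easily defined as: $rn(\square : t) = t @ (\square : \varepsilon)$ | $rn(h : t) = rn(t @ (h : \varepsilon))$.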
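
The rotation function $rn$ is easy to check on small lists. Below is a Python sketch with two versions: one that follows the recursive definition step by step, and one that computes the same result by slicing. Both assume that the mark occurs in the list; the function names are chosen for this example only.

```python
BOX = "□"  # the end-of-list mark


def rn_by_definition(items):
    """Rotate one position at a time until the mark is first, then move it last."""
    items = list(items)
    while items[0] != BOX:
        items = items[1:] + items[:1]   # rn(h : t) = rn(t @ (h : e))
    return items[1:] + [BOX]            # rn(□ : t) = t @ (□ : e)


def rn(items):
    """Same result, computed directly from the position of the mark."""
    i = items.index(BOX)
    return items[i + 1:] + items[:i] + [BOX]
```

For example, the interface `["□", "a1", "S(y2)"]` is rotated to `["a1", "S(y2)", "□"]`, and a list whose mark is already last is left unchanged.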

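Definition 5 (Interpretation). The interpretation of a machine configuration
$\Sigma$ is an interaction net $[\![\Sigma]\!]$ obtained as follows:

1. We compute $U[\Sigma] = (\Gamma^{U[\Sigma]} \mid S^{U[\Sigma]} \mid \phi_N^{U[\Sigma]} \mid V^{U[\Sigma]} \mid C^{U[\Sigma]} \mid t^{U[\Sigma]})$.

2. We then build the net $N^{U[\Sigma]} = \langle \lfloor rn(V^{U[\Sigma]}) \rfloor \mid \lfloor S^{U[\Sigma]} \rfloor \cup \lfloor C^{U[\Sigma]} \rfloor \cup \lfloor t^{U[\Sigma]} \rfloor \rangle$.

3. $[\![\Sigma]\!] = Clp[N^{U[\Sigma]}, \phi_N^{U[\Sigma]}]$ concludes our construction.

Notice that the interpretation of a configuration is unique. It is immediate to see
that all the configurations $\Sigma[\vec{t} \mid \Delta]$ have the same interpretation $\langle \vec{t} \mid \Delta \rangle$.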

The interpretation of a degenerate configuration disregards all the active 
pairs and cycles contained in the heap of a degenerate configuration. This is 
appropriate in the sense that this is information that cannot be read back. 

Lemma 2. If $\Sigma$ is an irreducible configuration, then $[\![\Sigma]\!]$ is in normal form.

Proof. Immediate. $\Delta^{U[\Sigma]}$ may be empty (if $C$ is), or contain cycles. □
Notation. In what follows we do not distinguish notationally between the reduc-
tion $\to$ (or its transitive $\to^*$ or reflexive $\to^=$ closures) of machine configu-
rations and of interaction nets, since the distinction can be made from context.
The same is true for structural equivalence ($\equiv$).

Lemma 3. Let $\Sigma$ be a configuration with correct annotations, $\Sigma \to \Sigma'$ using
any rule except I, II.4, and III.5, and $\Sigma$, $\Sigma'$ non-degenerate. Then $[\![\Sigma]\!] \equiv [\![\Sigma']\!]$.

Lemma 4. Let $\Sigma \to \Sigma'$, with $\Sigma$, $\Sigma'$ non-degenerate and correctly-annotated.
Then $[\![\Sigma]\!] \to^= [\![\Sigma']\!]$.

Lemma 5. The set of rules II.1 to II.3, III.0 to III.4, and T.1 to T.4 is nor-
malizing, with the same normal forms as the complete set of rules (i.e., they have
an empty $S$ component and delist thread), together with all the configurations to
which one of the rules I, II.4, or III.5 can be applied.



Proposition 4 (Correctness). Let $N = \langle \vec{t} \mid \Delta \rangle$ be an interaction net. Then

$N \Downarrow N'$ iff $\Sigma[\vec{t} \mid \Delta] \to^* \Sigma$,

with $\Sigma$ an irreducible machine configuration, and $[\![\Sigma]\!] \equiv N'$.

Observe that a result such as $[\![\Sigma]\!] \to [\![\Sigma']\!] \Rightarrow \Sigma \to^* \Sigma'$ does not hold, since
the semantics is non-deterministic and besides, it allows for variables in the
observable interface to be updated at any time.

5 A Concurrent Abstract Machine 

The machine we have been considering is inherently sequential: it is always 
deterministically decided which rule is applied at each stage. However, the way 
the machine has been formulated makes it trivial to obtain a concurrent machine 
from it, simply by including more processing threads within a configuration. 

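Definition 6 (Multi-thread Configuration). An n-thread configuration is:

$(\Gamma \mid S \mid \phi_N \mid V \mid C \mid [t_1, \ldots, t_n])$

where $[t_1, \ldots, t_n]$ is a list of threads and all other components are as before.

Definition 7 (Concurrent reduction). $\xrightarrow{ct}$ is the smallest relation verifying

$$\frac{(\Gamma \mid S \mid \phi_N \mid V \mid C \mid t_i) \to (\Gamma' \mid S' \mid \phi'_N \mid V' \mid C' \mid t'_i)}{(\Gamma \mid S \mid \phi_N \mid V \mid C \mid [t_1 \ldots t_i \ldots t_n]) \xrightarrow{ct} (\Gamma' \mid S' \mid \phi'_N \mid V' \mid C' \mid [t_1 \ldots t'_i \ldots t_n])}$$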
This definition is intuitive: a pool of threads work on the shared data-structures,
each thread proceeding individually and deterministically, according to the se-
quential machine rules. Each $\xrightarrow{ct}$ step is a step of one of the individual threads.

Non-determinism is introduced because it may be the case that several threads 
are willing to operate. For instance, when several threads are in the delist state 
and the list is not empty, one of them will fetch the top pair and change state 
to process, while the others will keep trying to access the list. 

The resulting machine almost falls into the standard producers-consumers 
synchronization model of shared-memory computation, with the synchronization 
problem solved by implementing a shared queue of tasks (pairs to be processed).
The difference is that every thread is a consumer and may be a producer of tasks.
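
The synchronization pattern described here, where every worker consumes tasks from a shared queue and may also produce new ones, can be sketched in Python with the standard `queue` and `threading` modules. This is a generic illustration of the pattern, not the paper's implementation: `run_pool`, `countdown` and the integer tasks are hypothetical stand-ins for the machine's threads and pairs.

```python
import queue
import threading


def run_pool(initial_tasks, process, n_threads=4):
    """Run `process` on every task; `process` returns the new tasks it produces."""
    tasks = queue.Queue()
    for task in initial_tasks:
        tasks.put(task)
    done = []
    done_lock = threading.Lock()

    def worker():
        while True:
            task = tasks.get()                # take a task (cf. delist)
            if task is None:                  # sentinel: no more work
                tasks.task_done()
                return
            for new_task in process(task):    # handle it, then add new tasks (cf. enlist)
                tasks.put(new_task)
            with done_lock:
                done.append(task)
            tasks.task_done()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for th in threads:
        th.start()
    tasks.join()                              # wait until every task, old or new, is handled
    for _ in threads:
        tasks.put(None)
    for th in threads:
        th.join()
    return done


def countdown(n):
    """Toy task: handling n produces the task n - 1, until 0 is reached."""
    return [n - 1] if n > 0 else []


result = run_pool([3, 2], countdown)
```

New tasks are enqueued before the task that produced them is marked as done, so the queue's count of unfinished tasks cannot reach zero while work is still pending. The order in which tasks are handled varies between runs, which reflects the non-determinism discussed above.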



Properties. The definition of the reduction of a multi-thread configuration as
individual reductions of any of its threads makes all the inductively proved prop-
erties of the sequential machine also valid for the concurrent one. Lemmas 2, 3,
4 and 5 and Propositions 1 and 3 all hold, with the interpretation of a multi-
thread configuration modified to include all the pairs being processed by threads,
and irreducible configurations having an empty Pairs list and all threads delist.

Unfortunately, Prop. 2 is no longer true, which destroys our correctness result.
This is due to a race condition when traversing an annotation list to perform
substitutions, which allows the generation of irrecoverable cycles in the heap.

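As an example of such a situation, consider the following 2-thread configuration
(we omit the V and C components), for a net containing a 2-cell cycle
which we show may get lost in the heap. Consider that $w \in \mathcal{N}(t)$ and $z \in \mathcal{N}(u)$.

$\langle \emptyset \mid (x, [w : \square].t) : (y, [z : \square].u) : S \mid \phi_N[x \leftrightarrow z, y \leftrightarrow w] \mid [\mathit{delist}, \mathit{delist}] \rangle$

$\longrightarrow^{*}_{T.1} \langle \emptyset \mid S \mid \phi_N[x \leftrightarrow z, y \leftrightarrow w] \mid [\mathit{process}(x, [w : \square].t), \mathit{process}(y, [z : \square].u)] \rangle$

$\longrightarrow_{III.2} \langle \emptyset \mid S \mid \phi_N[x \leftrightarrow z, y \leftrightarrow w] \mid [\mathit{process}(x, [w : \square].t), \mathit{process}(y, [\square : z].u)] \rangle$

$\longrightarrow_{III.2} \langle \emptyset \mid S \mid \phi_N[x \leftrightarrow z, y \leftrightarrow w] \mid [\mathit{process}(x, [\square : w].t), \mathit{process}(y, [\square : z].u)] \rangle$

$\longrightarrow_{III.5} \langle \{x \mapsto [w : \square].t\} \mid S \mid \phi_N[x \leftrightarrow z, y \leftrightarrow w] \mid [\mathit{delist}, \mathit{process}(y, [\square : z].u)] \rangle$

$\longrightarrow_{III.5} \langle \{x \mapsto [w : \square].t, y \mapsto [z : \square].u\} \mid S \mid \phi_N[x \leftrightarrow z, y \leftrightarrow w] \mid [\mathit{delist}, \mathit{delist}] \rangle$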



Recovering Correctness. First, we remark that this form of degenerate configu- 
rations is not very harmful, since one could simply forbid nets containing cycles. 
However, a simple way exists to recover lost cycles, by keeping in the config- 
urations an additional component: a list of variables stored in the heap, kept 
up-to-date by the rules. After post-processing, and according to Prop[3] (remem- 
ber post-processing rules implement updating of configurations with empty lists 
of Pairs), this list will be empty if the configuration is not degenerate. If it is 
not empty, then the machine may be restarted with any pair (variable, term) 
still stored in the heap. Once a pair in a cycle is recovered in this way, regular 
machine operation will proceed to store the cycle in the appropriate structure. 
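The recovery step just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the machine's actual rule set: the heap is modelled as a Python dict, the additional component as a list `heap_vars` of the variables currently stored in the heap, and `restart_from_heap` is a hypothetical name.

```python
def restart_from_heap(heap, heap_vars, pairs):
    """Move one (variable, term) entry from the heap back to Pairs.

    Returns False when heap_vars is empty, i.e. when the configuration
    reached after post-processing is not degenerate and nothing needs
    to be recovered.
    """
    if not heap_vars:
        return False
    var = heap_vars.pop()        # any stored variable will do
    term = heap.pop(var)         # keep the heap and the list in step
    pairs.append((var, term))    # the machine can now be restarted
    return True
```

After a successful call the machine would be run again on `pairs`; the sketch deliberately leaves that loop out.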

6 Implementing the Abstract Machine

Sequential Implementation. The abstract machine may be programmed sequen- 
tially in a straightforward way. It suffices to choose appropriate data-structures 
for configurations, and to map machine operations into these structures. The 
‘motor’ that runs the machine is simply a loop that pops a pair of terms from 
the stack and decides which machine operation to apply. We remark that this decision
is not as complex as the number of operations might lead one to think: the form
of the pair being processed restricts the pattern-matching to a family of rules.
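A minimal sketch of such a 'motor' is given below. The term encoding (`("var", name)` and `("agent", symbol, args)`) and the three toy rule families are assumptions made for illustration; they are not the rules of the abstract machine. The point is only that the form of the popped pair selects the family of rules to try.

```python
def store_in_heap(left, right, pairs, heap):
    # Toy rule: a variable paired with any term is recorded in the heap.
    heap[left[1]] = right


def swap(left, right, pairs, heap):
    # Toy rule: put the variable on the left and retry.
    pairs.append((right, left))


def decompose(left, right, pairs, heap):
    # Toy rule: two agents with the same symbol pair up their arguments.
    if left[1] != right[1]:
        raise ValueError("no toy rule for different agents")
    pairs.extend(zip(left[2], right[2]))


TOY_RULES = {
    ("var", "var"): store_in_heap,
    ("var", "agent"): store_in_heap,
    ("agent", "var"): swap,
    ("agent", "agent"): decompose,
}


def run(pairs, heap, rules=TOY_RULES):
    """Pop pairs from the stack until it is empty; return the step count."""
    steps = 0
    while pairs:
        left, right = pairs.pop()
        family = (left[0], right[0])   # the form of the pair picks the rule
        rules[family](left, right, pairs, heap)
        steps += 1
    return steps
```

Dispatching on a small key keeps the decision cheap regardless of how many rules the table holds.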

Optimizations. During operation of the machine the size of Pairs tends to grow 
considerably. One possible optimization with respect to keeping the list reason- 
ably small is to give priority to the execution of rules that remove pairs from it. 
This is a straightforward modification: it suffices to add a second list of pairs to
the configurations (for active pairs), and to replace rules T.1 to T.3 with a new
set of rules that manages the two lists according to the desired priority scheme.

Concurrent Implementation. A well-suited technology exists for implementing 
our concurrent abstract machine: POSIX Threads, which are lightweight pro- 
cesses running in the address space of the same UNIX process, individually 
scheduled. This technology has the advantage of producing implementations that 
may be run (without modifications in the code) in machines with any number 
of processors. If conveniently supported by the kernel of the operating system, 
threads will run in true parallelism, assigned to different processors. 

The way to implement our abstract machine using this technology is to launch 
as many threads (running the same code, as specified by the machine rules) as 
there are processors in the machine (this can be done automatically at run-time) . 

Now it is time to ask whether the correspondence between the resulting 
implementation and the abstract machine is perfect. In a uniprocessor machine 
this is indeed the case: a single thread is executed at any time, and this is 
what the machine captures, even though in practice each machine operation 
is decomposed into a sequence of program instructions, which means that time- 
slicing between threads may occur in the middle of the execution of an operation. 

In the case of multiprocessors, however, in order to obtain a correct imple- 
mentation, one has to stipulate that a parallel reduction step is any sequence of 
concurrent reduction steps in which every thread is involved at most once. 

To illustrate this point, consider the situation of two threads in the delist 
state. The outcome if both execute rule T.l simultaneously is not predictable. 
The correct behaviour is that two different configurations may result, corre- 
sponding to the two orders in which the threads may take pairs from the list. 

Now the parallel reducer must keep to this behaviour, protecting the access
to the Pairs and Cycles data-structures by means of some synchronization
mechanism: the first thread to gain access to a shared structure will make the other
thread(s) block until the first one releases the structure. Locks [7] are used to
implement a linearizable queue, and this is thus not a wait-free implementation.
We leave a detailed treatment of this matter for the long version of this
paper; there is another situation requiring locking, which concerns the case of
two adjacent pairs of variables (two axiom cuts) being processed simultaneously.
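The locking discipline can be sketched as follows. This is an illustration in Python rather than POSIX Threads, and the names `SharedPairs` and `drain` are hypothetical. It launches one thread per processor by default and guards the shared list with a lock, so that each pair is taken by exactly one thread.

```python
import os
import threading


class SharedPairs:
    """A list of pairs whose access is protected by a lock."""

    def __init__(self, pairs):
        self._pairs = list(pairs)
        self._lock = threading.Lock()

    def take(self):
        # Other threads block here until the lock is released.
        with self._lock:
            return self._pairs.pop() if self._pairs else None


def drain(pairs, n_threads=None):
    """Let n_threads workers empty the shared list.

    Returns, for each worker, the list of pairs it obtained.
    """
    n_threads = n_threads or os.cpu_count() or 1
    shared = SharedPairs(pairs)
    taken = [[] for _ in range(n_threads)]

    def worker(index):
        while True:
            pair = shared.take()
            if pair is None:
                return
            taken[index].append(pair)   # each worker writes only its own slot

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return taken
```

Which worker obtains which pair varies from run to run; what the lock guarantees is that no pair is taken twice or lost.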

Other situations are critical with respect to race conditions in true parallelism.
Take for example the two threads $\mathit{process}(x, [\square : T].t)$ and
$\mathit{process}(y, [\square : T'].u)$, with $\phi_N(x) = y$. Our abstract machine prevents this from generating a
lost active pair in the heap: once one of the threads executes rule III.5, the other
will be unable to execute it (executing III.4 instead). But in true parallelism the
threads may execute III.5 simultaneously, giving a degenerate configuration. We
remark that the problem now is not in the access to the data structures.

One solution to this problem that requires no additional synchronization is the
mechanism explained in Sect. 5 to recover entries in the heap back to Pairs. This
optimistic solution relies on the low probability of such race situations occurring.

7 Conclusions 

We have presented an abstract machine which provides answers to the questions 
raised in SectEl Specifically, the machine proposes concrete data-structures, to- 
gether with an algorithm for implementing interaction net reduction. We have 
also specified how rules are represented and applied. Identification of active pairs 
is automatic, and question 2 is answered by the post-processing operations. With 
respect to 5, all the machine rules perform simple operations in constant time. 

We have implemented this machine following Sect El and obtained a robust 
reducer that performs well. The granularity of the machine operations is quite 
small, while still keeping the operations sufficiently atomic: our tests show that 
each interaction is decomposed in between 7 and 12 machine operations. 

The concurrent machine additionally provides answers to questions 6 and 
7, resulting in the first parallel implementation of interaction nets, with the 
significant advantage of running on commonly available workstations. We remark 
that this parallel implementation is not limited by the number of active pairs 
in particular nets, since all rewiring operations are performed in parallel. Our 
current research involves studying the performance of this implementation.

Abstract. This paper studies the relationship between synchronous and
asynchronous mobile processes, in the setting of the π-calculus. A type
system for processes of the asynchronous monadic subcalculus is introduced
and used to obtain a full-abstraction result: two processes of the
polyadic π-calculus are typed barbed congruent iff their translations into
the subcalculus are asynchronous-monadic-typed barbed congruent.



1 Introduction 

This paper studies the relationship between synchronous and asynchronous mobile
processes, in the setting of the π-calculus [MPW92, Mil99]. A primitive of
the π-calculus, inherited from its ancestor CCS, is a form of handshake
communication. The (polyadic) π-term $\bar{x}\langle a_1 a_2\rangle.P$ expresses a process that may
send the pair of names $a_1, a_2$ via the link named $x$ and continue as $P$, and the
term $x(y_1 y_2).Q$ a process that may receive a pair of names via $x$ (a reader
unfamiliar with the π-calculus may care to refer to section 2). Interaction between these
processes is expressed by

$\bar{x}\langle a_1 a_2\rangle.P \mid x(y_1 y_2).Q \longrightarrow P \mid Q\{a_1 a_2/y_1 y_2\}$,

where $\{a_1 a_2/y_1 y_2\}$ indicates substitution of the $a$s for the $y$s in $Q$. The fact
that interaction is expressed by handshake communication is important for the
tractability of the π-calculus, and that of many other theories of concurrent
systems that are based on communication primitives of a similar nature.

On the other hand, many concurrent systems, especially distributed systems, 
use forms of asynchronous communication, in which the act of sending a datum 
and the act of receiving it are separate. Relatedly, many languages for pro- 
gramming concurrent or distributed systems have asynchronous primitives, an 
important reason for this being that they are amenable to efficient implementa- 
tion. Language features for synchronized communication are often implemented 
using asynchronous primitives. 

The π-calculus has a subcalculus in which communication may be understood
as asynchronous [HT91, Bou92]. The key step in achieving this is the decree that
in the subcalculus, the only output-prefixed terms are those of the form $\bar{x}\langle a\rangle.0$,

* This work was done while the author was at BRICS, Aarhus University, Denmark.

J. Tiuryn (Ed.): FOSSACS 2000, LNCS 1784, pp. 283- f2TO 2000. 

(c) Springer- Verlag Berlin Heidelberg 2000 
where $0$ expresses a process that has no capabilities. In a term of the subcalculus,
a subterm $\bar{x}\langle a\rangle.0$ at top level may be thought of as a datum $a$ that has been
sent but not received and that is available to any subterm at top level of the
form $x(y).Q$.

The theory of the asynchronous subcalculus is much less tractable than that
of the π-calculus. It is the subcalculus, however, that is the basis for the
concurrent programming language Pict [PT97]. Further, the join-calculus [FG96],
which is itself closely related to the asynchronous π-calculus, is the basis for a
language for programming distributed systems [FM97]. Language features for
synchronized communication are implemented in many languages, among them
Pict and the join-language, by means of compilations based on well-known
asynchronous communication protocols. The first study of such a compilation carried
out using a mathematical model of mobile processes was in [Bou92], where a
specific translation was shown to be computationally adequate with respect to
Morris's extensional operational preorder. The present paper studies the
translation of [Bou92], but extended in a straightforward way to polyadic π-terms.
Thus the translation considered is from pπ, the set of polyadic π-terms, to amπ,
the set of asynchronous monadic π-terms.

The paper is concerned with the effect of the translation on behavioural
equivalence. The standard equivalence on π-terms is barbed congruence.
Roughly, two terms are barbed congruent if no difference in behaviour can be
observed between the systems obtained by placing them into an arbitrary
π-context. The notion of observation is natural, and the basis for observation of
difference in behaviour is a kind of bisimulation game. The definition of barbed
congruence, involving as it does quantification over a class of contexts, is also
appropriately sensitive to the (sub)calculus under consideration. For instance
on polyadic terms a typing discipline is used to separate out ill-formed terms
such as $\bar{x}\langle a_1 a_2 a_3\rangle.P \mid x(y_1 y_2).Q$, in which sender and receiver disagree on the
length of tuple to be communicated. Given a sorting $\lambda$ of the kind introduced
in [Mil91] that achieves this separating out, polyadic terms that respect $\lambda$ are
barbed $\lambda$-congruent if no difference in behaviour can be observed between the
systems obtained by placing them into an arbitrary context that itself respects
$\lambda$.

The translation from pπ to amπ is not fully abstract. The reason, briefly, is
that there are amπ-contexts that do not conform to the protocol underlying the
translation, and some such contexts are able to expose differences between the
translations of pπ-terms that are barbed $\lambda$-congruent. The aim of this paper is to
obtain a full-abstraction theorem for the translation by giving a characterization
of a suitable class of amπ-contexts. The basis for the characterization is a type
system for amπ-terms. This type system is based on a graph, derived in a simple
manner from the sorting, that describes aspects of the protocol that is at the
core of the translation. The full-abstraction theorem asserts that for any sorting
$\lambda$, two pπ-terms are barbed $\lambda$-congruent if and only if their translations are
barbed $\lambda^{am}$-congruent. Barbed $\lambda^{am}$-congruence is the natural variant of barbed
congruence on well-typed amπ-terms.

The study of type systems for mobile processes is an important topic, both to
aid rigorous analysis of systems and to assist in programming. The present paper
develops techniques introduced in [QW98] and builds on earlier work on types
for mobile processes, for instance in IHon93l KPTMI IPSW] ISan97l [Til^ . 

The paper [QW98] proves a full-abstraction theorem for a translation from pπ
to the monadic π-calculus. The present paper studies a translation from pπ to
the asynchronous monadic subcalculus. Because of the separation of sending and
receiving in the asynchronous subcalculus, the technical details in the present
paper differ greatly from, and are considerably more difficult than, those
in [QW98]. We wish to mention in particular the paper [Yos96], where a notion
of graph type for monadic processes is introduced and studied. Nodes of a graph
type represent atomic actions, and edges an activation ordering between them.
Although [Yos96] and the present paper both use graphs for similar purposes,
the technical developments in the two papers are entirely different.

We believe that the present paper is the first to prove a full-abstraction
theorem for the translation studied in [Bou92]. The present paper studies the
translation extended to polyadic terms in order to show the generality of the 
techniques used. The type system introduced is of a kind that is well under- 
stood, and its rules are, in our view, informative. One or two of the rules are 
quite complicated, however. We believe that this complication may be intrinsic; 
we tried many alternatives before reaching the system described here. In the 
space available it is not possible to explain why simpler-looking systems are in- 
adequate. Instead, we explain the main features of the system, using examples 
where appropriate, and give enough of the technical development to outline the 
structure of the argument. An important point is that the class of typeable
amπ-contexts contains much more than just the translations of well-sorted polyadic
contexts.

Several papers study translations between subcalculi of the π-calculus. For
instance, in addition to [Bou92] mentioned earlier, [Bor98] studies a translation
from the asynchronous π-calculus to the subcalculus of the π-calculus, πI, in which
only private names may be communicated. Also, [MS98] studies in depth a
subcalculus, Lπ, of the asynchronous monadic calculus in which only the output
capability of names may be communicated. In particular a full-abstraction
result is shown for the translation from Lπ to the subcalculus LπI of πI. The
subcalculus Lπ is closely related to the join-calculus, about which many results,
including results on translations, are shown in [Fou98].

In the papers just mentioned, a summation operator on terms is lacking.
Encodings of asynchronous processes involving forms of summation in the
summation-free subcalculus, another topic that is important in
programming-language implementation, are studied in [NP96, Nes97]. Summation is also
absent from the calculus considered in the present paper. In [Pal97], a result is
shown that is paraphrased (Remark 6.1, p. 264) as follows: “There exists no
uniform encoding of the π-calculus [with guarded summation] into the asynchronous
π-calculus preserving a reasonable semantics”, where uniform and reasonable are
given specific meanings. The translation of [Bou92] is uniform and the semantics
studied in the present paper is reasonable, in the senses of [Pal97].

In our view, the main theorem of the present paper is of intrinsic interest: it 
shows precisely how the class of amπ-contexts needs to be cut down to obtain
full abstraction. Moreover, the characterization uses standard machinery of type 
systems, albeit with some inevitable complication. Further, the characterization 
is closely tied, via the type system, to the protocol that is at the core of the 
translation: it shows clearly how that protocol affects process equivalence. Fi- 
nally, because of the importance of type systems for mobile processes in general, 
we believe that several ideas needed for the proof of completeness may be useful 
for tackling other problems. 

In section 2 we recall necessary background material, in section 3 we introduce 
the type system for asynchronous monadic processes, and in section 4 we briefly 
outline the proof of the main result. 



2 Background 

In this section we recall necessary definitions and notations. We refer to the 
papers cited in the Introduction for further explanation. 

We presuppose a countably-infinite set, $N$, of names, ranged over by lower-case
letters. We write $\tilde{x}$ for a tuple $x_1 \ldots x_n$ of names.

The prefixes are given by

$\pi ::= \bar{x}\langle\tilde{y}\rangle \mid x(\tilde{z})$

where $\tilde{z}$ is a tuple of distinct names. In each case, $x$ is the subject.

The (polyadic) processes are given by

$P ::= 0 \mid \pi.P \mid P \mid P' \mid \nu z\, P \mid\, !P$.

We write pπ for the set of processes, and use $P, Q, R$ to range over pπ. A process
is monadic if for each subterm $\bar{x}\langle\tilde{y}\rangle.P$ or $x(\tilde{y}).P$ of it, $\tilde{y}$ is of length 1. A process
is asynchronous if for each subterm $\bar{x}\langle\tilde{y}\rangle.P$ of it, $P$ is $0$. We abbreviate $\bar{x}\langle\tilde{y}\rangle.0$ to
$\bar{x}\langle\tilde{y}\rangle$. We write amπ for the set of processes that are asynchronous and monadic.

In $x(\tilde{z}).P$ and in $\nu z\, P$ the displayed occurrences of $\tilde{z}$ and $z$ are binding with
scope $P$. An occurrence of a name in a term is free if it is not within the scope
of a binding occurrence of the name. We write $fn(P)$ for the set of names that
have a free occurrence in $P$, and $fn(P, Q, \ldots)$ for $fn(P) \cup fn(Q) \cup \ldots$.
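As a hedged illustration of the binding conventions, the following sketch computes the free names of a process. The tuple encoding of processes (`("nil",)`, `("out", x, names, P)`, `("in", x, names, P)`, `("par", P, Q)`, `("nu", z, P)`, `("bang", P)`) and the function name `fn` are choices made here for the example only.

```python
NIL = ("nil",)


def fn(p):
    """Return the set of names with a free occurrence in process p."""
    tag = p[0]
    if tag == "nil":
        return set()
    if tag == "out":                      # output prefix binds nothing
        _, x, ys, cont = p
        return {x} | set(ys) | fn(cont)
    if tag == "in":                       # input prefix binds the names zs
        _, x, zs, cont = p
        return {x} | (fn(cont) - set(zs))
    if tag == "par":
        return fn(p[1]) | fn(p[2])
    if tag == "nu":                       # restriction binds z
        return fn(p[2]) - {p[1]}
    if tag == "bang":
        return fn(p[1])
    raise ValueError(f"unknown process form: {tag!r}")
```

For example, in a process that restricts `z` around an output of `a, z` on `x` in parallel with an input on `x`, the name `z` and the input-bound name are not free, while `x`, `a` and any other names of the continuations are.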

We write $\{y_1 \ldots y_n/x_1 \ldots x_n\}$ for the substitution that maps $x_i$ to $y_i$ for each
$i$ and is otherwise the identity, and $P\{y_1 \ldots y_n/x_1 \ldots x_n\}$ for the process obtained
by applying it to $P$, with change of bound names if necessary to avoid captures.

We adopt the following conventions on bound names: processes that are
α-convertible are identified; and when a collection of processes and other entities
such as substitutions or sets of names is considered, it is assumed that the bound
names of the processes are chosen to be different from their free names and from
the names of the other entities.

A context is obtained when the hole $[\cdot]$ replaces an occurrence of $0$ in a
process. We write $C[P]$ for the process obtained by replacing the occurrence of
the hole in the context $C$ by $P$.

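Structural congruence is the smallest congruence, $\equiv$, on processes such that

1. $P_1 \mid (P_2 \mid P_3) \equiv (P_1 \mid P_2) \mid P_3$, $P_1 \mid P_2 \equiv P_2 \mid P_1$, $P \mid 0 \equiv P$

2. $\nu z\, \nu w\, P \equiv \nu w\, \nu z\, P$, $\nu z\, 0 \equiv 0$

3. $\nu z\, (P_1 \mid P_2) \equiv P_1 \mid \nu z\, P_2$ provided $z \notin fn(P_1)$

4. $!P \equiv P \mid\, !P$.

Reduction is the smallest relation, $\longrightarrow$, on processes such that

1. $\bar{x}\langle\tilde{y}\rangle.P \mid x(\tilde{z}).Q \longrightarrow P \mid Q\{\tilde{y}/\tilde{z}\}$ provided $|\tilde{y}| = |\tilde{z}|$

2. $P \longrightarrow P'$ implies $P \mid Q \longrightarrow P' \mid Q$

3. $P \longrightarrow P'$ implies $\nu z\, P \longrightarrow \nu z\, P'$

4. $P \equiv Q \longrightarrow Q' \equiv P'$ implies $P \longrightarrow P'$.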

If $x$ is a name then $\bar{x}$ is a co-name. The observability predicates, $\downarrow_{\bar{x}}$, are
defined by: $P \downarrow_{\bar{x}}$ if $P$ has an unguarded subterm $\bar{x}\langle\tilde{y}\rangle.Q$ with the displayed
occurrence of $x$ free in $P$. (An occurrence of a term is unguarded if it is not
underneath a prefix.)

Barbed bisimilarity is the largest symmetric relation, $\dot{\approx}$, such that if $P \mathrel{\dot{\approx}} Q$
then $P \downarrow_{\bar{x}}$ implies $Q \longrightarrow^{*} \downarrow_{\bar{x}}$, and $P \longrightarrow P'$ implies $Q \longrightarrow^{*} \dot{\approx}\; P'$. The normal
definition of barbed bisimilarity on the π-calculus requires also that: $P \downarrow_{x}$ implies
$Q \longrightarrow^{*} \downarrow_{x}$. On the asynchronous subcalculus, however, it is normal to take the
observables to be the co-names only. On the full calculus, closing under contexts
one obtains the same relation (barbed congruence) whether or not names are
deemed observable. We therefore work with the definition as stated.

Now fix a set $S$ of sorts, ranged over by $s, t$, and a sorting $\lambda : S \to S^{+}$, where
$S^{+}$ is the set of nonempty tuples of sorts. We use $\Psi$ to range over finite partial
functions from names to sorts. We write $n(\Psi)$ for the domain of $\Psi$, and $x : s$ for
$(x, s)$. In particular, if $n(\Psi) = \{x_1, \ldots, x_n\}$ and $\Psi(x_i) = s_i$ for each $i$, we write
$\{x_1 : s_1, \ldots, x_n : s_n\}$ for $\Psi$. We write $\Psi, \Psi'$ for $\Psi \cup \Psi'$, provided $x \in n(\Psi) \cap n(\Psi')$
implies $\Psi(x) = \Psi'(x)$; and we write $\Psi, x : s$ for $\Psi, \{x : s\}$. Note: we will later
consider functions from names to other sets, and will then use similar notations.

$P$ is a $\lambda$-process if a judgment $\Psi \vdash P$ can be inferred using the rules in
table 1, where the prefix rules share the side condition that $\lambda(s) = (t_1 \ldots t_n)$,
and the output-prefix rule has in addition the side condition: $x = y_i$ implies
$s = t_i$, and $y_i = y_j$ implies $t_i = t_j$. In accordance with the convention on bound
names, in writing $\Psi \vdash P$ it is assumed that the bound names of $P$ are chosen
to be different from the names in $\Psi$. So in particular $z \notin n(\Psi)$ in the restriction
rule, and $z_1, \ldots, z_n \notin n(\Psi)$ in the input-prefix rule.

Basic properties of the type system are: preservation of typing under $\equiv$,
subject reduction, and freedom from arity-disagreements:

Lemma 1. 1. If $\Psi \vdash P$ and $Q \equiv P$, then $\Psi \vdash Q$.

2. If $\Psi \vdash P$ and $P \longrightarrow P'$, then $\Psi \vdash P'$.

3. If $P$ is a $\lambda$-process and $P \longrightarrow^{*} \nu\tilde{w}\, (\bar{x}\langle\tilde{y}\rangle.P_1 \mid x(\tilde{z}).P_2 \mid P_3)$, then $|\tilde{y}| = |\tilde{z}|$.

The appropriate barbed congruence for $\lambda$-processes is defined as follows:

Definition 2. 1. A context $C$ is a $\lambda(\Psi)$-context if for some $\Psi'$ there is an
inference of $\Psi' \vdash C$ in which the hole is typed by $\Psi \vdash [\cdot]$.

2. $P$ and $Q$ are barbed $\lambda$-congruent, $P \approx_{\lambda} Q$, if there is $\Psi$ such that $\Psi \vdash P, Q$
and $C[P] \mathrel{\dot{\approx}} C[Q]$ for every $\lambda(\Psi)$-context $C$.

3 The Asynchronous Monadic Type System 

Definition 3. The translation $[\![\cdot]\!]$ from pπ to amπ is defined by the following
clauses (where for clarity we give the clauses for triples; the general case is then
as expected), together with the stipulation that $[\![\cdot]\!]$ is a homomorphism for the
other operators:

$[\![\bar{x}\langle a_1 a_2 a_3\rangle.P]\!] = \nu w\,(\bar{x}w \mid w(v_1).(\bar{v}_1 a_1 \mid w(v_2).(\bar{v}_2 a_2 \mid w(v_3).(\bar{v}_3 a_3 \mid [\![P]\!]))))$
$[\![x(y_1 y_2 y_3).P]\!] = x(w).\nu v_1\,(\bar{w}v_1 \mid v_1(y_1).\nu v_2\,(\bar{w}v_2 \mid v_2(y_2).\nu v_3\,(\bar{w}v_3 \mid v_3(y_3).[\![P]\!])))$

where $w, v_1, v_2, v_3 \notin fn([\![P]\!])$.
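The general case of the translation, for tuples of any length, might be sketched as below. The tuple encoding of processes and the names `translate` and `is_async_monadic` are assumptions of this example. Fresh names are generated as `w#k` and `v#k`, on the further assumption that source processes never use names containing `#`, which is how the sketch meets the freshness side condition.

```python
from itertools import count

NIL = ("nil",)


def translate(p, fresh=None):
    """Translate a polyadic process into an asynchronous monadic one."""
    fresh = fresh if fresh is not None else count()
    tag = p[0]
    if tag == "nil":
        return NIL
    if tag == "par":
        return ("par", translate(p[1], fresh), translate(p[2], fresh))
    if tag == "nu":
        return ("nu", p[1], translate(p[2], fresh))
    if tag == "bang":
        return ("bang", translate(p[1], fresh))
    _, x, names, cont = p
    w = f"w#{next(fresh)}"                      # primary name
    vs = [f"v#{next(fresh)}" for _ in names]    # one secondary name per datum
    body = translate(cont, fresh)
    if tag == "out":
        # Build w(v1).(v1<a1> | w(v2).(... | [[P]])) from the inside out.
        for v, a in reversed(list(zip(vs, names))):
            body = ("in", w, [v], ("par", ("out", v, [a], NIL), body))
        return ("nu", w, ("par", ("out", x, [w], NIL), body))
    if tag == "in":
        # Build nu v1 (w<v1> | v1(y1). nu v2 (... [[P]])) from the inside out.
        for v, y in reversed(list(zip(vs, names))):
            body = ("nu", v, ("par", ("out", w, [v], NIL), ("in", v, [y], body)))
        return ("in", x, [w], body)
    raise ValueError(f"unknown process form: {tag!r}")


def is_async_monadic(p):
    """Check that every prefix carries one name and outputs have no continuation."""
    tag = p[0]
    if tag == "nil":
        return True
    if tag == "par":
        return is_async_monadic(p[1]) and is_async_monadic(p[2])
    if tag in ("nu", "bang"):
        return is_async_monadic(p[-1])
    _, _, names, cont = p
    if len(names) != 1:
        return False
    return cont == NIL if tag == "out" else is_async_monadic(cont)
```

The check `is_async_monadic` is included only to confirm that the output of `translate` lies in the target subcalculus.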

Communication of an n-tuple in pπ is mimicked by a sequence of 2n + 1
reductions in amπ. In the case n = 2 we have:

$[\![\bar{x}\langle a_1 a_2\rangle.P \mid x(y_1 y_2).Q]\!]$

$\longrightarrow \nu w\,(w(v_1).(\bar{v}_1 a_1 \mid w(v_2).(\bar{v}_2 a_2 \mid [\![P]\!])) \mid \nu v_1\,(\bar{w}v_1 \mid v_1(y_1).\nu v_2\,(\bar{w}v_2 \mid v_2(y_2).[\![Q]\!])))$

$\longrightarrow \nu w\,\nu v_1\,(\bar{v}_1 a_1 \mid w(v_2).(\bar{v}_2 a_2 \mid [\![P]\!]) \mid v_1(y_1).\nu v_2\,(\bar{w}v_2 \mid v_2(y_2).[\![Q]\!]))$

$\longrightarrow \nu w\,(w(v_2).(\bar{v}_2 a_2 \mid [\![P]\!]) \mid \nu v_2\,(\bar{w}v_2 \mid v_2(y_2).[\![Q]\!]\{a_1/y_1\}))$

$\longrightarrow \nu v_2\,(\bar{v}_2 a_2 \mid [\![P]\!] \mid v_2(y_2).[\![Q]\!]\{a_1/y_1\})$

$\longrightarrow [\![P]\!] \mid [\![Q]\!]\{a_1 a_2/y_1 y_2\}$.

The first reduction, via $x$, establishes a private link $w$ between sender and
receiver. In the second reduction the receiver uses $w$ to transmit a private link $v_1$,
and in the third the sender transfers $a_1$ along $v_1$. The receiver then transmits
another private link $v_2$ via $w$, and in the last reduction the sender transfers $a_2$
along $v_2$. This completes the protocol for communicating the pair $a_1, a_2$ via $x$.
In a slight variant of the protocol, the private name $v_1$ is used to send both $a_1$
and $a_2$. Although we have not studied this variant, we imagine that a similar
analysis may be carried through.
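The sequence of communications in the protocol can be listed mechanically. The sketch below is an illustration only: the function name `protocol_trace` and the naming of the private links as `w` and `v1, v2, ...` are assumptions. Each entry of the returned list is a pair (link used, name sent), and the length of the list is 2n + 1 for an n-tuple.

```python
def protocol_trace(x, datums):
    """List the communications that transfer the tuple datums via x."""
    trace = [(x, "w")]                 # sender passes the private link w via x
    for i, a in enumerate(datums, start=1):
        v = f"v{i}"
        trace.append(("w", v))         # receiver sends the private link v_i along w
        trace.append((v, a))           # sender sends a_i along v_i
    return trace
```

For the pair `a1, a2` this yields the five steps described in the paragraph above.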

We refer to names, such as $w$ in the example above, that are passed in the
first step of the protocol as primary names; and we refer to names, such as $v_1$
and $v_2$ in the example, that are used to pass names that occur in the process
being translated as secondary names. We introduce m-sorts for classifying these
names, and a graph, both derived from the sorting $\lambda$.

Definition 4. The set of m-sorts is S™ = 5™ U 5™ where the set of primary 
m-sorts is 

5™ = {o® I s e 5} U {s* I 1 < i < I A(s) I, s G 5} U {•} 

and the set of secondary m-sorts is 

= {o®‘ I 1 < z < I A(s) I, s G 5} U {,5®’ I 1 < z < I A(s) |, s G 5} . 

We use a to range over primary m-sorts, S to range over secondary m-sorts, a 
to range over m-sorts, and o to range over the o® and the o® . 



Definition 5. The labelled directed graph Q\ has set of nodes 5™ and arrows 
as follows, where A(s) = (ti . . . t„): 



O® _> ^ g2 ^ 



where Si is i5® . 



We use the following notations: for 3cr, tr'.cr ct'; ct for Ba'.a' cr; 

a a' for 35, t.a ^ a'; cr+ for the a' such that a a' , provided cr 7 ^ •; and 
CT++ for the tr" such that a a' ^ a”, provided a, a' ^ •. 



Referring to the example above, suppose $\lambda(s) = (t_1 t_2)$ and $x$ is of sort $s$
and $a_i$ is of sort $t_i$ for each $i$. The type system will assign different primary
m-sorts to various occurrences of the primary name $w$ to capture their different
roles in the translated process. The occurrence in $\bar{x}w$ will be assigned primary
m-sort $\circ^{s}$, those in $\bar{w}v_1$ and $w(v_1)$ primary m-sort $s^{1}$, and those in $\bar{w}v_2$ and
$w(v_2)$ primary m-sort $s^{2}$. The name $w$ is first carried by $x$, then carries $v_1$, then
carries $v_2$, and finally disappears (represented in $\mathcal{G}_\lambda$ by $\bullet$), having completed
its contribution to the protocol. A secondary name is sent and then used once
for communication: for each $i$ the occurrence of the secondary name $v_i$ in $\bar{w}v_i$
will be assigned secondary m-sort $\circ^{s^i}$, and the occurrences in $\bar{v}_i a_i$ and $v_i(y_i)$
secondary m-sort $\delta^{s^i}$. Note that $v_1$ carries $a_1$ of sort $t_1$ and $v_2$ carries $a_2$ of
(possibly different) sort $t_2$. This information is recorded in the labels of $\mathcal{G}_\lambda$.
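The life cycle of a primary name described here suggests a simple successor function on primary m-sorts. The sketch below is an assumption-laden illustration: it encodes the initial m-sort of sort `s` as `("circ", s)`, the i-th m-sort as `("idx", s, i)` and the terminal node as `BULLET`, takes the arities of the sorting as a dict, and ignores the labels of the graph. The function names `plus` and `plus_plus` are chosen for the example.

```python
BULLET = ("bullet",)


def circ(s):
    """The primary m-sort of a name when it is first carried by a name of sort s."""
    return ("circ", s)


def plus(sigma, arity):
    """Successor of a primary m-sort; arity[s] is the tuple length for sort s."""
    if sigma == BULLET:
        raise ValueError("the terminal node has no successor")
    if sigma[0] == "circ":
        return ("idx", sigma[1], 1)
    _, s, i = sigma
    return ("idx", s, i + 1) if i < arity[s] else BULLET


def plus_plus(sigma, arity):
    """Second successor, defined only when sigma and its successor are not terminal."""
    return plus(plus(sigma, arity), arity)
```

With a sort of arity 2, a primary name passes through the initial m-sort, then the first and second indexed m-sorts, and finally reaches the terminal node, matching the four stages listed in the paragraph above.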
In general the m-sorts assigned to names in a type judgment will give
information about how the names occur in the process in question. The judgments
are of the form $\Psi; \Delta; \Gamma; \Omega; \Pi \vdash M$ with $M$ an amπ-process. The function $\Psi$
associates sorts with names, as in the type system for pπ. The functions $\Delta$ and
$\Gamma$ associate m-sorts with names that occur free at the top level in $M$. More
precisely, $\Delta$ gives information about input prefixes of the form $x(y)$, and $\Gamma$
information about output particles of the form $\bar{x}y$. Further, $\Omega$ gives information
about free names in prefixes not at the top level in $M$, and $\Pi$ certain associations
between names. We explain this further after introducing the type system.

Notation 6. We use A, F to range over finite partial functions from N to 5™. 

We use fl to range over finite partial functions from N to 5™ U x N). 
We write $\Omega^{z}$ for the function with domain $n(\Omega) - \{z\}$ such that for $x \in n(\Omega^{z})$,

$\Omega^{z}(x) = \sigma$ if $\Omega(x) = (\sigma, z)$, and $\Omega^{z}(x) = \Omega(x)$ otherwise.

We use $\Pi$ to range over finite partial functions from $N$ to $N$. We write $\Pi^{z}$
for the function obtained from $\Pi$ by deleting any pair in which $z$ occurs.



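Definition 7. $M$ is a $\lambda^{am}$-process if a judgment $\Psi; \Delta; \Gamma; \Omega; \Pi \vdash M$ can be
inferred using the rules in table 2 where:

1. In accordance with the convention on bound names, in writing
$\Psi; \Delta; \Gamma; \Omega; \Pi \vdash M$ it is assumed that the bound names of $M$ are chosen
to be different from the names in $n(\Psi, \Delta, \Gamma, \Omega, \Pi)$. So in particular
$z \notin n(\Psi, \Delta, \Gamma, \Omega, \Pi)$ in the restriction rules, and $w \notin n(\Psi)$ in the rule for $x(w).M$,
and $v \notin n(\Omega)$ in the rule for $w(v).M$, and $a \notin n(\Psi, w)$ in the rule for $v(a).M$.

2. The rule for $x(w).M$ is two rules written as one: if $(\circ^{s})^{++} = \bullet$ then $\{w : (\circ^{s})^{++}\}$ is read as $\emptyset$.

3. The rule for $w(v).M$ is two rules written as one: if $\sigma^{+} = \bullet$ then $\{w : \sigma^{+}\}$ is read as $\emptyset$.

4. The rule for $v(a).M$ is three rules written as one: if $\sigma = \bullet$ then $\{w : \sigma\}$ and $\{w : \sigma^{+}\}$ and
$\{w : (\sigma, v)\}$ are all read as $\emptyset$; if $\sigma \neq \bullet$ but $\sigma^{+} = \bullet$ then $\{w : \sigma^{+}\}$ is read as $\emptyset$.

5. The side condition (comp) of par is: $\Delta_1, \Gamma_1, \Omega_1, \Pi_1$ and $\Delta_2, \Gamma_2, \Omega_2, \Pi_2$ are
complementary, as defined below.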



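Definition 8. $\Delta_1, \Gamma_1, \Omega_1, \Pi_1$ and $\Delta_2, \Gamma_2, \Omega_2, \Pi_2$ are complementary if

1. $n(\Delta_1) \cap n(\Delta_2) = n(\Gamma_1) \cap n(\Gamma_2) = n(\Omega_1) \cap n(\Omega_2) = \emptyset$

2. $\Delta_1, \Delta_2$; $\Gamma_1, \Gamma_2$; $\Omega_1, \Omega_2$; $\Pi_1, \Pi_2$ are compatible, where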



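Definition 9. $\Delta, \Gamma, \Omega, \Pi$ are compatible if

1. if $x \in n(\Delta) \cap n(\Gamma)$ then $\Delta{\upharpoonright}x$ and $\Gamma{\upharpoonright}x$ are $x$-partners (see below)

2. if $x \in n(\Omega) \cap n(\Gamma)$ then $x \in n(\Pi)$ and $\Omega(x) = (\Gamma(x)^{+}, \Pi(x))$, or $x \notin n(\Pi)$
and $\Omega(x) = \Gamma(x)^{+}$

$\Psi; \emptyset; \emptyset; \emptyset; \emptyset \vdash 0$

$\Psi, x : s; \emptyset; \{w : \circ^{s}\}; \emptyset; \emptyset \vdash \bar{x}w$   ($w \neq x$)

$\Psi; \emptyset; \{w : \sigma, v : \circ^{\sigma}\}; \emptyset; \{(w, v)\} \vdash \bar{w}v$   ($w, v \notin n(\Psi)$, $w \neq v$)

$\Psi, a : t; \emptyset; \{v : \delta\}; \emptyset; \emptyset \vdash \bar{v}a$   ($v \neq a$)

$\dfrac{\Psi, x : s; \emptyset; \{w : (\circ^{s})^{+}\}; \{w : (\circ^{s})^{++}\}; \emptyset \vdash M}{\Psi, x : s; \emptyset; \emptyset; \emptyset; \emptyset \vdash x(w).M}$   ($w \notin n(\Psi)$)

$\dfrac{\Psi; \{w : \sigma^{+}\}; \{v : \delta^{\sigma}\}; \emptyset; \emptyset \vdash M}{\Psi; \{w : \sigma\}; \emptyset; \emptyset; \emptyset \vdash w(v).M}$

$\dfrac{\Psi, a : t; \emptyset; \{w : \sigma\}; \{w : \sigma^{+}\}; \emptyset \vdash M}{\Psi; \{v : \delta\}; \emptyset; \{w : (\sigma, v)\}; \emptyset \vdash v(a).M}$   ($v \notin n(\Psi, w)$)

par: $\dfrac{\Psi; \Delta_1; \Gamma_1; \Omega_1; \Pi_1 \vdash M_1 \qquad \Psi; \Delta_2; \Gamma_2; \Omega_2; \Pi_2 \vdash M_2}{\Psi; \Delta_1, \Delta_2; \Gamma_1, \Gamma_2; \Omega_1, \Omega_2; \Pi_1, \Pi_2 \vdash M_1 \mid M_2}$   (comp)

$\dfrac{\Psi, z : s; \Delta; \Gamma; \Omega; \Pi \vdash M}{\Psi; \Delta; \Gamma; \Omega; \Pi \vdash \nu z\, M}$

$\dfrac{\Psi; \Delta, \Delta'; \Gamma, \Gamma'; \Omega; \Pi \vdash M}{\Psi; \Delta; \Gamma; \Omega^{z}; \Pi^{z} \vdash \nu z\, M}$   ($\Delta', \Gamma'$ are $z$-partners)

$\dfrac{\Psi; \emptyset; \emptyset; \emptyset; \emptyset \vdash M}{\Psi; \emptyset; \emptyset; \emptyset; \emptyset \vdash\, !M}$

Table 2. The typing rules for amπ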



3. if a; G n(l7)nn(Z\) — n(T) then Q{x) = (a,y) where a = A{x) or a = A{x) 
and if y G n(T) then cr = A{x) 

4. if n(x) G n(Z\) and T(x)+ ^ •, then x G n(l7). 





292 Paola Quaglia and David Walker 



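Definition 10. $\Delta$ and $\Gamma$ are $x$-partners if $\Gamma = \{x : \circ^{s}\}$ and $\Delta = \{x : (\circ^{s})^{+}\}$,
or $\Gamma = \{x : \circ^{\sigma}\}$ and $\Delta = \{x : \delta^{\sigma}\}$, or $\Gamma = \{x : \sigma\}$ and $\Delta = \{x : \sigma\}$.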



The origin of the subtlety of the type system is, as one might expect, the
separation between sending and receiving. The crux is to find an appropriate
rule for typing compositions. The most delicate point is how to capture
compatibility of an output particle and an input-prefixed process when the subject
of the particle is a primary name and the subject of the top-level input prefix
is a secondary name. To make this clearer we first examine in detail how the
translations of $\bar{x}\langle a_1 a_2\rangle.0$ and $x(y_1 y_2).0$ are typed. Suppose $\lambda(s) = (t_1 t_2)$ so that
in $\mathcal{G}_\lambda$ we have



where is s*. 



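Let $\Psi = \{x : s, a_1 : t_1, a_2 : t_2\}$ and $\Psi_1 = \Psi, \{y_1 : t_1\}$ and $\Psi_2 = \Psi, \{y_1 : t_1, y_2 : t_2\}$.
Then the type inferences are as follows, each judgment being inferred from those listed above it.

For the sender:

$\Psi; \emptyset; \{v_2 : \delta^{s^2}\}; \emptyset; \emptyset \vdash \bar{v}_2 a_2$

$\Psi; \emptyset; \{v_2 : \delta^{s^2}\}; \emptyset; \emptyset \vdash \bar{v}_2 a_2 \mid 0$

$\Psi; \{w : \sigma_2\}; \emptyset; \emptyset; \emptyset \vdash w(v_2).(\bar{v}_2 a_2 \mid 0)$

$\Psi; \emptyset; \{v_1 : \delta^{s^1}\}; \emptyset; \emptyset \vdash \bar{v}_1 a_1$

$\Psi; \{w : \sigma_2\}; \{v_1 : \delta^{s^1}\}; \emptyset; \emptyset \vdash \bar{v}_1 a_1 \mid w(v_2).(\bar{v}_2 a_2 \mid 0)$

$\Psi; \{w : \sigma_1\}; \emptyset; \emptyset; \emptyset \vdash w(v_1).(\bar{v}_1 a_1 \mid w(v_2).(\bar{v}_2 a_2 \mid 0))$

$\Psi; \emptyset; \{w : \circ^{s}\}; \emptyset; \emptyset \vdash \bar{x}w$

$\Psi; \{w : \sigma_1\}; \{w : \circ^{s}\}; \emptyset; \emptyset \vdash \bar{x}w \mid w(v_1).(\bar{v}_1 a_1 \mid w(v_2).(\bar{v}_2 a_2 \mid 0))$

$\Psi; \emptyset; \emptyset; \emptyset; \emptyset \vdash [\![\bar{x}\langle a_1 a_2\rangle.0]\!] = \nu w\,(\bar{x}w \mid w(v_1).(\bar{v}_1 a_1 \mid w(v_2).(\bar{v}_2 a_2 \mid 0)))$

For the receiver:

$\Psi_2; \emptyset; \emptyset; \emptyset; \emptyset \vdash 0$

$\Psi_1; \{v_2 : \delta^{s^2}\}; \emptyset; \emptyset; \emptyset \vdash v_2(y_2).0$

$\Psi_1; \emptyset; \{w : \sigma_2, v_2 : \circ^{s^2}\}; \emptyset; \{(w, v_2)\} \vdash \bar{w}v_2$

$\Psi_1; \{v_2 : \delta^{s^2}\}; \{w : \sigma_2, v_2 : \circ^{s^2}\}; \emptyset; \{(w, v_2)\} \vdash \bar{w}v_2 \mid v_2(y_2).0$

$\Psi_1; \emptyset; \{w : \sigma_2\}; \emptyset; \emptyset \vdash N = \nu v_2\,(\bar{w}v_2 \mid v_2(y_2).0)$

$\Psi; \{v_1 : \delta^{s^1}\}; \emptyset; \{w : (\sigma_2, v_1)\}; \emptyset \vdash v_1(y_1).N$

$\Psi; \emptyset; \{w : \sigma_1, v_1 : \circ^{s^1}\}; \emptyset; \{(w, v_1)\} \vdash \bar{w}v_1$

$\Psi; \{v_1 : \delta^{s^1}\}; \{w : \sigma_1, v_1 : \circ^{s^1}\}; \{w : (\sigma_2, v_1)\}; \{(w, v_1)\} \vdash \bar{w}v_1 \mid v_1(y_1).N$   (*)

$\Psi; \emptyset; \{w : \sigma_1\}; \{w : \sigma_2\}; \emptyset \vdash \nu v_1\,(\bar{w}v_1 \mid v_1(y_1).N)$

$\Psi; \emptyset; \emptyset; \emptyset; \emptyset \vdash [\![x(y_1 y_2).0]\!] = x(w).\nu v_1\,(\bar{w}v_1 \mid v_1(y_1).N)$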



Note that when rule is applied to type vi{yi).N in the typing of 
|x(yiy2)- 0 ], the 17 -component in the conclusion keeps track of the fact that 
the primary name w is used for sending in the continuation of the input-prefixed 
process. This information is vital for checking the admissibility of possible par- 
allel compositions. The judgment 

J={<F;{vi: (5'^!}; 0; {id : (o 2 ,di)}; 0 b vi{yi). W 2 {wv 2 \ V2{y2)-0)=vi{yi). N ) 

illustrates this point well. First, to type the composition of vi(yi).A^ and an 
output particle of the form wu, the datum u must be vi. (The reader may care 
to check that par is applicable at (*) in the inference above, using clause 2 of
definition 9.) But secondly, things are quite different when $v_1(y_1).N$ is composed
with an input-prefixed process of the form $w(u).M$ (see clause 3 of definition 9).
For consider the first two reduction steps in 

\x{aia2).P I x{yiy2).Q\ — |-Pl | , 

when P and Q are both 0. Then since typeability should be invariant under 
structural congruence, it should be possible to ‘compose’ the judgment J both 
with 

iF; {w : tJi}; 0; 0; 0 h w{vi). (UTai | w{v 2 )- (^202 | 0)) 

(consider the process reached after one reduction step), and with 

!F; {w : (T2}; 0; 0; 0 F w { v 2 )- (^202 | 0) 

(consider the process reached after the second reduction step). 

Notation 11. We write $\Xi$ for $\Psi; \Delta; \Gamma; \Omega; \Pi$, and similarly $\Xi_0$ for
$\Psi_0; \Delta_0; \Gamma_0; \Omega_0; \Pi_0$, and $\Xi'$ for $\Psi'; \Delta'; \Gamma'; \Omega'; \Pi'$ etc.

Due to lack of space we omit some basic lemmas about the type system.
We state, however, that typing is preserved by the translation, and that the
translation of a pπ-term has no other typings:

Lemma 12. 1. If $\Psi \vdash P$ then $\Psi; \emptyset; \emptyset; \emptyset; \emptyset \vdash [\![P]\!]$.

2. If $\Xi \vdash [\![P]\!]$ then $\Delta = \Gamma = \Omega = \Pi = \emptyset$ and $\Psi \vdash P$.

Other important facts about the type system, the latter being a special case of
a more general result, are:

Lemma 13. 1. If $\Xi \vdash M$ and $N \equiv M$, then $\Xi \vdash N$.

2. If $\Psi; \emptyset; \emptyset; \emptyset; \emptyset \vdash M$ and $M \longrightarrow M'$, then $\Psi; \emptyset; \emptyset; \emptyset; \emptyset \vdash M'$.

These essential properties of a viable type system are not easy to achieve in
conjunction with lemma 18 below, which is crucial for the proof of the completeness
of the type system.

We now define the appropriate form of barbed congruence for amπ-processes,
based on the type system. First we have:

Definition 14. 77 is balanced if 

1. n(!F) n n(Zi, r, 77) = 0 

2. A,r,f2,n are compatible 

3. n(Z\) = n(P, 77). 

The class of amπ-processes that are typed by balanced $\Xi$s contains the
translations of λ-processes and enjoys good closure properties. We then have:

Definition 15. 1. $K$ is a $\lambda^{am}(\Xi)$-context if for some balanced $\Xi'$ there is an
inference of $\Xi' \vdash K$ in which the hole is typed by $\Xi \vdash [\cdot]$; and $K$ is m-closed
if in addition $\Delta' = \Gamma' = \Omega' = \Pi' = \emptyset$.

2. $M$ and $N$ are barbed $\lambda^{am}$-congruent, $M \approx^{am}_\lambda N$, if there is a balanced $\Xi$
such that $\Xi \vdash M, N$ and $K[M] \approx K[N]$ for every m-closed $\lambda^{am}(\Xi)$-context
$K$.
4 Summary of Main Results

The principal result is that two λ-processes are barbed λ-congruent if and only
if their translations are barbed $\lambda^{am}$-congruent:

Theorem 16 (Full Abstraction). $[\![P]\!] \approx^{am}_\lambda [\![Q]\!]$ if and only if $P \approx_\lambda Q$.

The 'only if' is the easier assertion to prove. The main lemma needed for it
is:

Lemma 17. If $R$ is a λ-process then $[\![R]\!] \approx R$.

Using this lemma, the proof of soundness is completed as follows. Suppose
$[\![P]\!] \approx^{am}_\lambda [\![Q]\!]$ and $\Xi \vdash [\![P]\!], [\![Q]\!]$ where $\Xi = \Psi; \emptyset; \emptyset; \emptyset; \emptyset$, so $K[[\![P]\!]] \approx K[[\![Q]\!]]$ for
every m-closed $\lambda^{am}(\Xi)$-context $K$. Then $\Psi \vdash P, Q$, and if $C$ is a $\lambda(\Psi)$-context
then $[\![C]\!]$ is an m-closed $\lambda^{am}(\Xi)$-context and hence

$C[P] \approx [\![C[P]]\!] = [\![C]\!][[\![P]\!]] \approx [\![C]\!][[\![Q]\!]] = [\![C[Q]]\!] \approx C[Q]$,

so $P \approx_\lambda Q$.

As just noted, the translation of a $\lambda(\Psi)$-context is a $\lambda^{am}(\Psi; \emptyset; \emptyset; \emptyset; \emptyset)$-context.
The class of $\lambda^{am}(\Xi)$-contexts contains much more than just parts of translations
of $\lambda(\Psi)$-contexts, however. To prove completeness we have to understand
precisely what it does contain. The main lemma needed is:

Lemma 18. If A is a A“'"(A7o)-context with Sq balanced, then there is a A(<?o)- 
context C such that if tf'o b M then iyuK[M] M] where u = n(A) 

and Uq = n(Ao). 

Proof. The proof is a fairly complicated induction on the inference of A7 h A. 
Let M* = vuq M. We sketch just the argument for composition. 

Suppose that A = A' | A" and A7 h A is inferred from S' h A' and 
S" h A". There are several cases. We outline the argument for just one of them: 
when S' and S" are not balanced and there is w such that P'{w) = A"{w) = a. 
First it can be shown that there is A* = A such that 

A* = \uv \vx (Wv I w{u). [va ](ua \ Ai) | v{y). A 2 [ | A 3 ] ) 

where u ^ fn(Ai) and A 3 , if present, is balanced. Note: [ ] around an expression 
indicates that it may be absent. Further, | A* | < | A |, where | T | is the size of 
a term T. 

Then let A* = wv \ w{u). [i'a]{ua \ Kf) \ v{y). A 2 and suppose that S\ h 
where n(Z\i) = Ui. Then vu\ K* i^ui A” where 

K** = [va](va \ Ai) | v{y).K 2 

and AJ* is balanced and | A** | < | AJ |. 

By induction hypothesis there is Ci such that i^Ui K**[M] |Ci][M*j. 

Moreover, if A 3 is present in A* then by induction hypothesis there is C 2 such 
that VU 2 A 3 [M] IC 2 HM*] where S 2 F A 3 and 1 x 2 = n(Z\ 2 ). 
Set $C = \nu x\, (C_1\, [\,|\, C_2])$. Then:

vuK[M\= vuK*[M\ 

= [vv {vx {vui Kl[M\ [ I VU 2 AT 3 [M]] ) 

[vv ]vx {vui Kl*[M] [ I VU2 K3,[M\] ) 
vx{lCi\[M*\[\lC2\[M*\]) 

= \C}[M*]. 

□ 

Using this lemma we may show completeness, that is, if P Q then |P] 

IQ]. For suppose that 'P \- P,Q and let S be >F;0;0;0;0. Then S is balanced 
and S h IUIjIQ]. Moreover if K is an m-closed A“"*(i7)-context then using 
lemma \n\ and lemma ITSl there is C such that 

KilPj] = IC[P]] « C[P] « C[Q] « [C[Q]1 = ICMIQl] 

KllQl] ■ 

Hence [P| [Ql- 

This completes the outline of the proof of the full-abstraction theorem. □ 
Abstract. Although type inference for dependent types is undecidable in
general, in practice the type inference algorithms of the Elf programming
language terminate in common cases. The present paper is a partial
explanation of this behaviour. It shows that for a wide range of terms,
namely terms that correspond to first-order logic proofs, the formalism of
dependent types gives decidable type inference. We also remark that
requiring the context and the type of a judgement to be first-order is not
sufficient for obtaining decidability.



1 Introduction 

Lambda calculus with dependent types is a formalism defined in [HHP87] in
order to provide a means for defining logics. For example, one can define first-order
logic within the formalism. This definition leads to a restriction on dependent
types which by itself constitutes an interesting type system for λ-terms.

Dependent types formalism has also been used as a base for the programming 
language Elf imn . The clauses of Elf are expressions of dependent types. This 
allows one to reason about properties of programs inside the language. Although the
problem of inferring types in the language is undecidable, as shown in [Dow93], it
turns out that for many practical programs the algorithm used in the framework
halts. This paper is a partial explanation of the phenomenon: type inference
for a wide range of terms, namely terms that correspond to proofs in first-order
logic, is decidable.

Interestingly enough, the borderline between decidability and undecidability
is very thin here. The problem of type inference for first-order logic inside
dependent types is defined as follows: given a first-order context $\Gamma$ and a
Curry term $M$, check whether there exists a first-order type $\tau$ such that
$\Gamma \vdash M : \tau$ is the end of some first-order derivation (i.e. a derivation that can
be translated into first-order logic). If the condition that $\Gamma \vdash M : \tau$ is the end
of some first-order derivation is relaxed so that $\Gamma \vdash M : \tau$ may be the end of any
derivation in dependent types, then the problem becomes undecidable. This holds
even for a class of terms $M$ that fall within an extension of first-order logic where
quantification over first-order function symbols is allowed.

The techniques used in this paper are based on the old idea that typing
problems correspond to problems of solving appropriate equations. We reduce type
inference for first-order logic to a special kind of equations with explicit
substitutions. These equations are subject to a further reduction that is very similar
to the usual Robinson unification. The equations obtained in this way are in turn
translated to second-order unification equations. As proved in [Gol81], second-order
unification is undecidable, so in general it cannot serve as a method for providing
decidability. In the present study, we get a particular form of equations which
allows us to design a procedure to solve them. We deal only with equations of
the three forms

$F_1(t_1, \dots, t_n) = F_2(s_1, \dots, s_m)$, $\quad F_1(t_1, \dots, t_n) = f(s_1, \dots, s_m)$,

$f_1(t_1, \dots, t_n) = f_2(s_1, \dots, s_m)$

where no second-order variable may occur in the terms $t_1, \dots, t_n, s_1, \dots, s_m$.
The latter condition is very important here, since the problem becomes undecidable
when we drop it. In [Sch98] it is shown that already solving equations of the
form $F_1(t_1, \dots, t_n) = f(s_1, \dots, s_m)$, where second-order variables may occur in
$s_1, \dots, s_m$, is undecidable.

The paper is organised as follows: Section 2 contains basic definitions and the
formulation of the problems we deal with; Section 3 presents a sketch of the
undecidability result for type inference with relaxed first-order constraints; and
Section 4 contains a sketch of the decidability result for first-order type inference.

2 Basic Definitions 

We introduce the definition of the system λP. The basic insight behind this
system is that types of terms may depend on terms. From the perspective of
programming languages this corresponds to providing devices for defining types
such as list(n) that represent lists of length n. From the point of view of
logic, this allows one to implement rules of substitution. We follow hereafter the
presentation in [SU98].

2.1 Language of λP

The set of pure λ-terms is defined according to the following grammar:

$M ::= x \mid (\lambda x.M) \mid (M M)$

Contexts are used in the typing system as well. These are sequences of pairs
$(\alpha : \kappa)$ or $(x : \tau)$, where $\alpha$ is a kind variable, $\kappa$ a kind, $x$ an object variable
and $\tau$ a type.

Pure λ-terms and contexts form a base for the Curry-style λ-calculus λP, the
expressions of which are inferred according to the following rules:
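2.2 Rules of λP

Kind formation rules:

(type)  $\vdash * : \Box$

(kind-abs)  $\dfrac{\Gamma, x : \tau \vdash \kappa : \Box}{\Gamma \vdash (\Pi x : \tau)\kappa : \Box}$

Kinding rules:

(kind-var)  $\dfrac{\Gamma \vdash \kappa : \Box}{\Gamma, \alpha : \kappa \vdash \alpha : \kappa}$  $(\alpha \notin \mathrm{Dom}(\Gamma))$

(kind-app)  $\dfrac{\Gamma \vdash \phi : (\Pi x : \tau)\kappa \qquad \Gamma \vdash M : \tau}{\Gamma \vdash \phi M : \kappa[x := M]}$

(kind-abs)  $\dfrac{\Gamma, x : \tau \vdash \sigma : *}{\Gamma \vdash (\forall x : \tau)\sigma : *}$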



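Typing rules:

(var)  $\dfrac{\Gamma \vdash \tau : *}{\Gamma, x : \tau \vdash x : \tau}$  $(x \notin \mathrm{Dom}(\Gamma))$

(app)  $\dfrac{\Gamma \vdash N : (\forall x : \tau)\sigma \qquad \Gamma \vdash M : \tau}{\Gamma \vdash N M : \sigma[x := M]}$

(abs)  $\dfrac{\Gamma, x : \tau \vdash M : \sigma}{\Gamma \vdash \lambda x.M : (\forall x : \tau)\sigma}$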



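Weakening rules:

(trm-kd)  $\dfrac{\Gamma \vdash \tau : * \qquad \Gamma \vdash \kappa : \Box}{\Gamma, x : \tau \vdash \kappa : \Box}$  $(x \notin \mathrm{Dom}(\Gamma))$

(typ-kd)  $\dfrac{\Gamma \vdash \kappa : \Box \qquad \Gamma \vdash \kappa' : \Box}{\Gamma, \alpha : \kappa \vdash \kappa' : \Box}$  $(\alpha \notin \mathrm{Dom}(\Gamma))$

(trm-typ)  $\dfrac{\Gamma \vdash \tau : * \qquad \Gamma \vdash \phi : \kappa}{\Gamma, x : \tau \vdash \phi : \kappa}$  $(x \notin \mathrm{Dom}(\Gamma))$

(typ-typ)  $\dfrac{\Gamma \vdash \kappa : \Box \qquad \Gamma \vdash \phi : \kappa'}{\Gamma, \alpha : \kappa \vdash \phi : \kappa'}$  $(\alpha \notin \mathrm{Dom}(\Gamma))$

(trm-trm)  $\dfrac{\Gamma \vdash \tau : * \qquad \Gamma \vdash M : \sigma}{\Gamma, x : \tau \vdash M : \sigma}$  $(x \notin \mathrm{Dom}(\Gamma))$

(typ-trm)  $\dfrac{\Gamma \vdash \kappa : \Box \qquad \Gamma \vdash M : \sigma}{\Gamma, \alpha : \kappa \vdash M : \sigma}$  $(\alpha \notin \mathrm{Dom}(\Gamma))$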



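Conversion rules:

(kd-conv)  $\dfrac{\Gamma \vdash \phi : \kappa \qquad \kappa =_\beta \kappa'}{\Gamma \vdash \phi : \kappa'}$

(typ-conv)  $\dfrac{\Gamma \vdash M : \sigma \qquad \sigma =_\beta \sigma'}{\Gamma \vdash M : \sigma'}$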
The definitions in Subsections 2.1 and 2.2 allow one to infer types for Curry-style
terms. The system λP is usually defined in the Church version as follows. First,
raw expressions are described according to the following grammar:

$\Gamma ::= \{\} \mid \Gamma, (x : \phi) \mid \Gamma, (\alpha : \kappa)$;

$\kappa ::= * \mid (\Pi x : \phi)\kappa$;

$\phi ::= \alpha \mid (\forall x : \phi)\phi \mid (\phi M)$;

$M ::= x \mid (M M) \mid (\lambda x : \phi.M)$

Some of these expressions, designated by inference rules, are called Church terms.
The inference rules have exactly the same form in most cases as the rules for
inferring Curry types. The exception is the rule (abs), which looks as follows:

(abs)  $\dfrac{\Gamma, x : \tau \vdash M : \sigma}{\Gamma \vdash \lambda x : \tau.M : (\forall x : \tau)\sigma}$

Note that the definition of types for the Church version also changes, since the
definition depends on the definition of terms, which is different in Curry and
Church styles. These versions are essentially equivalent, as shown in [vBLRU97].

In this document, we use the word ‘subtype’ to mean subexpression. We do 
not employ any other kind of ‘subtyping’ relation here. 

Definition 1 (type assignment) 

A derivation is a tree labelled with rules of λP such that, for each node, the
premises of its label are in bijection with the conclusions of the labels of its sons.
Derivations are usually denoted by letters like $\mathcal{D}, \mathcal{Q}, \dots$

We say that a derivation $\mathcal{D}$ assigns a type $\tau$ to a term $M$ iff the derivation
ends with the assertion $\Gamma \vdash M : \tau$ for some $\Gamma$.

Except where stated explicitly otherwise, we write $(\forall x : \sigma)\tau$ as $\sigma \to \tau$
provided that $x$ does not occur free in $\tau$.

We shall extensively use the notation $l((\forall x : \sigma_1)\sigma_2)$ and $l(\sigma_1 \to \sigma_2)$ to
denote $\sigma_1$, together with $r((\forall x : \sigma_1)\sigma_2)$ and $r(\sigma_1 \to \sigma_2)$ to denote $\sigma_2$.

We denote the α-conversion relation, applied to both types and λ-terms, by
$=_\alpha$.

We have to formulate the exact problem we want to solve. The notion of
signature is central to the syntactic part of any presentation of first-order
logic. Thus, we have to determine what part of the λP syntax corresponds to this
notion.

Definition 2 (signature first-order context) 

The signature first-order context is a λP context such that:

1. There is only one type variable 0 (which should be regarded as a type
constant), representing the type of individuals;

2. All kinds are of the form $0 \Rightarrow \cdots \Rightarrow 0 \Rightarrow *$;

3. There is a finite number of distinguished constructor variables, representing
relation symbols in the signature (they must be of appropriate kinds,
depending on arity);

4. Function symbols in the signature are represented by distinguished object
variables of types $0 \to \cdots \to 0 \to 0$, depending on arity;

5. Constant symbols are represented by distinguished object variables of type 0.

A λP context obtained from a signature $\Sigma$ is denoted by $\Gamma_\Sigma$.

The proof theory for first-order logic introduces a notion of a context
(environment) in which a formula is interpreted. This context is reflected by the
following notion (which is in the spirit of [SU98]):

Definition 3 (first-order context)

A first-order context over a signature context $\Gamma_\Sigma$ is a context in dependent types
of the form $\Gamma_\Sigma \cup \{x_1 : \phi_1, \dots, x_n : \phi_n\}$ where each $\phi_i$ is either a first-order type
or 0.

The notion of an algebraic term is crucial in the presentation of first-order
logic. These terms have their counterparts in λP. The most straightforward
definition of such terms looks as follows:

Definition 4 (homogeneous first-order term)

We say that $t$ is a homogeneous first-order term in a first-order context $\Gamma$ iff

- $t = x$ where $\Gamma(x) = 0$ (i.e. $x$ is a constant symbol or a first-order variable), or

- $t = f t_1 \cdots t_n$ where $\Gamma(f) = \underbrace{0 \to \cdots \to 0}_{n\ \text{times}} \to 0$ and each $t_i$ is a homogeneous
first-order term in $\Gamma$.
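Definition 4 can be read as a recursive check. The following is a minimal sketch in Python, under assumed encodings that are not part of the paper: the type 0 is the string "0", an arrow type is a tuple ("->", a, b), a variable or constant is a string, and an application $f t_1 \cdots t_n$ is the tuple (f, t1, ..., tn). The helper names arrow, algebraic_arity and is_homogeneous are hypothetical.

```python
ZERO = "0"  # assumed encoding of the type constant 0


def arrow(*types):
    """Build the right-nested arrow type a1 -> (a2 -> ... -> an)."""
    result = types[-1]
    for t in reversed(types[:-1]):
        result = ("->", t, result)
    return result


def algebraic_arity(ty):
    """Return n if ty is 0 -> ... -> 0 -> 0 with n argument positions, else None."""
    n = 0
    while isinstance(ty, tuple) and ty[0] == "->" and ty[1] == ZERO:
        ty = ty[2]
        n += 1
    return n if ty == ZERO else None


def is_homogeneous(term, ctx):
    """Check Definition 4 for `term` in the context `ctx` (a dict name -> type)."""
    if isinstance(term, str):
        # First clause: a variable or constant symbol of type 0.
        return ctx.get(term) == ZERO
    head, *args = term
    if not isinstance(head, str) or head not in ctx:
        return False
    # Second clause: the head must take exactly len(args) arguments of type 0.
    if algebraic_arity(ctx[head]) != len(args):
        return False
    return all(is_homogeneous(a, ctx) for a in args)
```

With ctx = {"c": "0", "x": "0", "f": arrow("0", "0", "0")}, the term ("f", "x", ("f", "c", "c")) is accepted, while the partial application ("f", "x") and the bare function symbol "f" are rejected.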

The next step in our presentation is to define the equivalent of a
first-order formula.

Definition 5 (first-order type)

We say that a type is a first-order type in the context $\Gamma$ iff it is of the form

- $P(t_1, \dots, t_n)$ where $\Gamma(P) = \underbrace{0 \Rightarrow \cdots \Rightarrow 0}_{n\ \text{times}} \Rightarrow *$, $n \ge 0$, and each $t_i$ is a
homogeneous first-order term in $\Gamma$, or

- $(\forall x_1 : 0) \cdots (\forall x_n : 0).\phi_1 \to \cdots \to \phi_m$ where each $\phi_i$ is a first-order type in
$\Gamma \cup \{x_1 : 0, \dots, x_n : 0\}$.

Finally, we have to define which derivations of λP may be regarded as
first-order derivations.

Definition 6 (first-order derivations)

We say that a derivation $\mathcal{D}$ in dependent types is a first-order derivation iff each
judgement $\Gamma \vdash M : \tau$ in the derivation is such that $\Gamma$ is a first-order context
over a fixed signature first-order context $\Gamma_\Sigma$ and $\tau$ is either a first-order type or
a type of the form $\underbrace{0 \to \cdots \to 0}_{n\ \text{times}} \to 0$ where $n > 0$.

This definition of derivation allows one to introduce a one-to-one correspondence
between derivations in λP and some first-order logic proofs. We do not present
details due to limited space.

We can now describe precisely the set of problems we deal with. 
Problem 1. The problem of type inference with a first-order context is defined
as follows:

Given: A Curry-style term $M$ and a first-order context $\Gamma$.

Question: Does there exist a derivation in λP that ends with $\Gamma \vdash M : \tau$ for a
first-order type $\tau$?

Problem 2. The problem of first-order type inference is defined as follows:

Given: A Curry-style term $M$ and a first-order context $\Gamma$.

Question: Does there exist a first-order derivation that ends with $\Gamma \vdash M : \tau$ for
a first-order type $\tau$?

Note that in both questions we assume that the context $\Gamma$ is a first-order
context. This is not standard for type inference problems, which are usually
formulated with arbitrary contexts. We may assume that the contexts in the
Given parts of the problems are first-order contexts, since checking whether a
context has this property is easy.

3 Undecidability of Type Inference with a First-Order 
Context 

Our undecidability proof is almost identical to the proof presented in [Dow93].

Theorem 1 (undecidability of Problem 1). The problem of type inference with a
first-order context is undecidable.

Proof. We describe the changes that should be made in Dowek's
proof in order to get our claim. The context $\Gamma$

$[0 : \mathrm{Type};\ a : 0 \to 0;\ b : 0 \to 0;\ c : 0;\ d : 0;\ P : 0 \to \mathrm{Type};\ F : (\forall x : 0)(P x) \to 0]$

used in [Dow93] should be replaced by

$[0 : \mathrm{Type};\ a : 0 \to 0;\ b : 0 \to 0;\ c : 0;\ d : 0;\ P : 0 \to \mathrm{Type};\ F : (\forall x : 0)(P x) \to (P c)]$.

The replacement enforces only a small change in the types used in the proof (some
occurrences of 0 should be replaced by $P(c)$), but this does not harm the reasoning in
Dowek's proof.

The undecidability is essentially obtained because we can quantify over
first-order function symbols here. The type $\forall x_1 : 0 \to 0 \cdots \forall x_n : 0 \to 0.(\beta x_1 \cdots x_n)$
used in the proof is the only element that goes beyond first-order logic.

4 First-Order Type Inference 

This material presents a proof for decidability of first-order type inference where 
signatures have at least one constant symbol. Such signatures give rise to a very 
wide class of instances and the restriction does not seem to be significant. In 
fact, the construction mentioned here requires only some minor modifications in 
order to provide a solution for the full problem. We lay aside the most general 
presentation for the sake of simplicity. 



Type Inference for First-Order Logic 303 



4.1 Generation of Equations 

Our algorithm works with equations. Thus, we have to define the entities to
be equated.

Definition 7 (e-terms)

The set of e-terms over the signature $\Sigma$ and variables $X$, denoted by $T_\Sigma(X)$, is
defined as follows:

- $x \in T_\Sigma(X)$ if $x \in X$;

- $f(t_1, \dots, t_n) \in T_\Sigma(X)$ if $f \in \Sigma$ has arity $n$ ($n > 0$) and $t_i \in T_\Sigma(X)$ for
$i = 1, \dots, n$;

- $t(x := s) \in T_\Sigma(X)$ where $t \in T_\Sigma(X)$, $x \in X$, and $s$ is an e-term over $\Sigma$ with
variables from $X$.

We extend this notion to types using a set $\mathcal{X}$ of type variables.

Definition 8 (e-types)

The set of e-types over the signature $\Sigma$, variables $X$ and type variables $\mathcal{X}$,
denoted by $\mathcal{T}_\Sigma(X, \mathcal{X})$, is defined as follows:

- $P(t_1, \dots, t_n) \in \mathcal{T}_\Sigma(X, \mathcal{X})$, if $t_i \in T_\Sigma(X)$ for $i = 1, \dots, n$;

- $\alpha \in \mathcal{T}_\Sigma(X, \mathcal{X})$, if $\alpha \in \mathcal{X}$;

- $\tau_1 \to \tau_2 \in \mathcal{T}_\Sigma(X, \mathcal{X})$, if $\tau_1, \tau_2 \in \mathcal{T}_\Sigma(X, \mathcal{X})$;

- $(\forall x : 0)\tau \in \mathcal{T}_\Sigma(X, \mathcal{X})$, if $\tau \in \mathcal{T}_\Sigma(X, \mathcal{X})$ and $x \in X$;

- $\tau(x := s) \in \mathcal{T}_\Sigma(X, \mathcal{X})$, if $\tau \in \mathcal{T}_\Sigma(X, \mathcal{X})$ and $s$ is a homogeneous first-order
term over $\Sigma$ with variables from $X$.

We use the notation $\tau(\vec{x} := \vec{t})$ in order to shorten $\tau(x_1 := t_1) \cdots (x_n := t_n)$.
The set of free first-order variables in an e-type (e-term) $\tau$, denoted by $\mathrm{Vars}(\tau)$,
is defined so that $x$ is bound in $(\forall x)\tau'$ and $\tau'(x := t)$.

We introduce the notation $\mathrm{TV}(\tau)$ to denote the set of all type variables in
$\tau$. The notations $\mathrm{Vars}(\cdots)$ and $\mathrm{TV}(\cdots)$ are extended so that they can be applied to
sets of e-types.

The notion of (semantical) substitution is not straightforward here, so we
present its definition. This notion describes a different operation from the one
defined later in Definition 13.

Definition 9 (first-order substitution)

A first-order substitution is a partial function from first-order variables to
e-terms with finite domain. We usually denote such a substitution by
$[x_1 := t_1, \dots, x_n := t_n]$. This function acts on e-terms and e-types so that no free
variable becomes bound, which may be expressed as follows:

- $x_i[x_1 := t_1, \dots, x_n := t_n] = t_i$;

- $y[x_1 := t_1, \dots, x_n := t_n] = y$ where for each $i = 1, \dots, n$ we have $x_i \ne y$;

- $f(s_1, \dots, s_m)[x_1 := t_1, \dots, x_n := t_n] = f(s'_1, \dots, s'_m)$ where
$s'_i = s_i[x_1 := t_1, \dots, x_n := t_n]$;

- $t(x := u)[x_1 := t_1, \dots, x_n := t_n] = t'(x' := u')$ where $u' = u[x_1 := t_1, \dots, x_n := t_n]$,
the variable $x'$ does not occur in any of the terms $x_1, \dots, x_n$, $t_1, \dots, t_n$,
and $t' = t[x := x'][x_1 := t_1, \dots, x_n := t_n]$;

- $\alpha[x_1 := t_1, \dots, x_n := t_n] = \alpha$;

- $P(s_1, \dots, s_m)[x_1 := t_1, \dots, x_n := t_n] = P(s'_1, \dots, s'_m)$ where
$s'_i = s_i[x_1 := t_1, \dots, x_n := t_n]$;

- $(\tau_1 \to \tau_2)[x_1 := t_1, \dots, x_n := t_n] = \tau'_1 \to \tau'_2$ where $\tau'_i = \tau_i[x_1 := t_1, \dots, x_n := t_n]$;

- $((\forall x : 0)\tau)[x_1 := t_1, \dots, x_n := t_n] = (\forall x' : 0)\tau'$ where $x'$ does not occur in
$x_1, \dots, x_n$, $t_1, \dots, t_n$, and $\tau' = \tau[x := x'][x_1 := t_1, \dots, x_n := t_n]$;

- $\tau(x := u)[x_1 := t_1, \dots, x_n := t_n] = \tau'(x' := u')$ where $u' = u[x_1 := t_1, \dots, x_n := t_n]$,
the variable $x'$ does not occur in any of the terms $x_1, \dots, x_n$, $t_1, \dots, t_n$,
and $\tau' = \tau[x := x'][x_1 := t_1, \dots, x_n := t_n]$.
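The e-term clauses of Definition 9 can be illustrated by a short Python sketch. The encoding is an assumption made only for this illustration: a variable is a string, $f(s_1, \dots, s_m)$ is ("app", f, [s1, ..., sm]), and $t(x := u)$ is ("sub", t, x, u). The names names, fresh and subst are hypothetical. The fresh variable is also kept apart from the variables of the body, which is slightly stronger than the side condition stated in the definition.

```python
import itertools


def names(t):
    """All variable names occurring in an e-term, free or bound."""
    if isinstance(t, str):
        return {t}
    if t[0] == "app":
        return set().union(*(names(a) for a in t[2]))
    _, body, x, u = t
    return names(body) | {x} | names(u)


def fresh(avoid):
    """Return a variable name that is not in `avoid`."""
    for i in itertools.count():
        candidate = f"x'{i}"
        if candidate not in avoid:
            return candidate


def subst(t, s):
    """Apply the first-order substitution `s` (dict variable -> e-term) to `t`."""
    if isinstance(t, str):
        # Clauses for x_i and for a variable y outside the domain.
        return s.get(t, t)
    if t[0] == "app":
        return ("app", t[1], [subst(a, s) for a in t[2]])
    # Clause for t(x := u): rename the bound variable first, then substitute.
    _, body, x, u = t
    avoid = names(body) | {x} | set(s)
    for image in s.values():
        avoid |= names(image)
    x2 = fresh(avoid)
    renamed = subst(body, {x: x2})      # t[x := x']
    return ("sub", subst(renamed, s), x2, subst(u, s))
```

For example, applying [y := x] to f(x, y)(x := y) renames the bound x before substituting, so the incoming x is not captured by the explicit substitution.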

The set $\mathrm{Paths}(\tau)$ of paths in an e-type $\tau$ (an e-term) is a set of sequences
of natural numbers defined so that subsequent numbers represent which part
of an e-type or an e-term is taken. For instance, the path 12 points to $\tau_2$ in
$(\forall x)(\tau_1 \to \tau_2)$, and the path 3 points to $t$ in $\tau'(x := t)$.

We have already introduced types with explicit substitutions (e-types). These
substitutions allow us to delay a substitution until a type variable is substituted,
at which point the delayed substitutions must be applied. All this work is done
by the reduction $\leadsto$.

Definition 10 (reduction for e-terms and e-types)

The reduction $\leadsto$ is defined for e-terms by rules 1-3 and for e-types by rules
4-11, where each of the rules 4-11 gives $\tau_1 \leadsto \tau_2$:

1. $t_1 = f(s_1, \dots, s_n) \leadsto f(s'_1, \dots, s'_n)$ where for some $i \in \{1, \dots, n\}$ we have
$s_i \leadsto s'_i$ and for $j \ne i$ we have $s_j = s'_j$;

2. $t(x := s) \leadsto t'(x := s)$ when $t \leadsto t'$;

3. $t_1 = t(x := s) \leadsto t[x := s]$ when $t$ is irreducible ($[x := s]$ is the usual
substitution);

4. $\tau_1 = P(s_1, \dots, s_n)$ and $\tau_2 = P(s'_1, \dots, s'_n)$ for some predicate $P \in \Sigma$, and for
some $i \in \{1, \dots, n\}$ we have $s_i \leadsto s'_i$ and for $j \ne i$ we have $s_j = s'_j$;

5. $\tau_1 = \sigma_1 \to \sigma_2$ and $\tau_2 = \sigma'_1 \to \sigma_2$ where $\sigma_1 \leadsto \sigma'_1$;

6. $\tau_1 = \sigma_1 \to \sigma_2$ and $\tau_2 = \sigma_1 \to \sigma'_2$ where $\sigma_2 \leadsto \sigma'_2$;

7. $\tau_1 = (\forall y : 0)\sigma_1$ and $\tau_2 = (\forall y : 0)\sigma'_1$ where $\sigma_1 \leadsto \sigma'_1$;

8. $\tau_1 = \sigma(x := s)$ and $\tau_2 = \sigma'(x := s)$, where $s \in T_\Sigma(X)$, $x \in X$, and $\sigma \leadsto \sigma'$;

9. $\tau_1 = (\sigma_1 \to \sigma_2)(x := s)$ and $\tau_2 = \sigma_1(x := s) \to \sigma_2(x := s)$, where $s \in T_\Sigma(X)$,
$x \in X$;

10. $\tau_1 = ((\forall y : 0)\sigma)(x := s)$ and $\tau_2 = (\forall y : 0)(\sigma(x := s))$, where $s \in T_\Sigma(X)$,
$x \ne y$ (if $x = y$, perform α-conversion first and then reduce according to the
present rule);

11. $\tau_1 = P(t_1, \dots, t_m)(x := s)$ and $\tau_2 = P(t'_1, \dots, t'_m)$, where $s \in T_\Sigma(X)$, $x \in X$,
$P \in \Sigma$ has arity $m$, and $t'_i = t_i(x := s)$ for $i = 1, \dots, m$.

As usual, we extend $\leadsto$ to its reflexive-transitive closure $\leadsto^*$.

We point out that according to the definition above an e-type of the form
$\alpha(x := t)$, where $\alpha$ is a variable, is irreducible.

The reduction has several good properties, the proofs of which are omitted
here: it has the Church-Rosser property, it is strongly normalising, and it is decidable.
Thus, we can define $\mathrm{NF}(\tau)$ and $\mathrm{NF}(t)$, the normal forms of the e-type $\tau$ and
the e-term $t$, respectively.
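For e-terms, rules 1-3 of Definition 10 amount to executing every delayed substitution from the inside out. A minimal Python sketch of the resulting normal form follows, assuming a hypothetical encoding in which a variable is a string, $f(s_1, \dots, s_n)$ is ("app", f, [s1, ..., sn]) and $t(x := s)$ is ("sub", t, x, s); the names replace and nf are illustrative only.

```python
def replace(t, x, u):
    """Usual substitution t[x := u] on a term without explicit substitutions."""
    if isinstance(t, str):
        return u if t == x else t
    _, f, args = t
    return ("app", f, [replace(a, x, u) for a in args])


def nf(t):
    """Normal form of an e-term with respect to rules 1-3."""
    if isinstance(t, str):
        return t
    if t[0] == "app":
        # Rule 1: reduce inside the arguments.
        return ("app", t[1], [nf(a) for a in t[2]])
    _, body, x, s = t
    # Rule 2 normalises the body; rule 3 then performs the substitution.
    # Normalising s beforehand is harmless here: every copy of s that survives
    # in the result would be normalised afterwards by rule 1 anyway.
    return replace(nf(body), x, nf(s))
```

For instance, f(x, y)(x := g(y))(y := c) normalises to f(g(c), c): the inner substitution is executed first, and the outer one then also reaches the y introduced by g(y).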

The following fact gives some insight into the structure of e-types.

Property 1. If $\tau$ is an e-type in normal form such that $\mathrm{TV}(\tau) = \emptyset$ then it
has no subtype (subterm) of the form $\sigma(x := t)$ ($s(x := t)$).

Definition 11 (equality for e-terms and e-types)

The equality for e-terms and e-types is defined as the least congruence containing
$\leadsto \cup =_\alpha$, and is denoted by $\sim$.

Definition 12 (e-equation)

We write $\tau_1 \doteq \tau_2$ to denote that the pair of e-types $\tau_1, \tau_2$ is an e-equation. Sets of
e-equations are denoted by $\mathcal{E}, \mathcal{E}', \dots$ The set $\mathcal{E}_\Sigma(X, \mathcal{X})$ is the set of e-equations
among e-types from $\mathcal{T}_\Sigma(X, \mathcal{X})$.

Now we define the notion of substitution we deal with.

Definition 13 (substitutions)

Each partial function from type variables to some $\mathcal{T}_\Sigma(X, \mathcal{Y})$ is called a
substitution.

We extend a substitution $S : \mathcal{X} \rightharpoonup \mathcal{T}_\Sigma(X, \mathcal{Y})$ to e-types inductively as follows:

- $S(0) = 0$;

- $S(P(t_1, \dots, t_m)) = P(t_1, \dots, t_m)$;

- $S(\alpha) = \alpha$ if $\alpha \notin \mathrm{Dom}(S)$;

- $S(\alpha)$ is the value of the partial function at $\alpha$ if $\alpha \in \mathrm{Dom}(S)$;

- $S((\forall x : 0)\sigma_2) = (\forall x : 0)S(\sigma_2)$;

- $S(\sigma_1 \to \sigma_2) = S(\sigma_1) \to S(\sigma_2)$;

- $S(\sigma(x := s)) = S(\sigma)(x := s)$.

Note that in the definition above we do not rename individual variables
while substituting under a quantifier. This is intentional: we accept that
some variables may become bound during such a substitution.

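Definition 14 (solution of a set of equations)

We say that a substitution $S : \mathcal{X} \rightharpoonup \mathcal{T}_\Sigma(X, \emptyset)$ is a solution of a set of e-equations
$\mathcal{E}$ iff for each e-equation $\tau_1 \doteq \tau_2 \in \mathcal{E}$ we have $S(\tau_1) \sim S(\tau_2)$.

We define $S(\Gamma){\downarrow}$ for a context $\Gamma$ as the sequence $\Gamma$ with each $x : \tau$ replaced
by $x : \mathrm{NF}(S(\tau))$.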

We cannot hope for a most-general solution property here. 

Example 1. Consider a signature context $\Gamma_\Sigma = \{P : 0 \Rightarrow *, c : 0\}$. The set of
equations $\mathcal{E} = \{\alpha(x := c)(y := c) \doteq P(c)\}$ has two solutions $S_1(\alpha) = P(x)$ and
$S_2(\alpha) = P(y)$, but there is no $S_3$ such that $S_3 \circ S_2 = S_1$ or $S_3 \circ S_1 = S_2$. Thus,
neither $S_1$ nor $S_2$ can be the most general solution.

Definition 15 (generation of equations)

Here is a nondeterministic procedure gener that takes as input a Curry-style
term $M$, an enriched first-order context $\Gamma$, and a path $p$ (intentionally
leading to $M$ in a bigger term) and generates a set of e-equations over the
signature extended with $c_0, c_1$, where $c_0, c_1$ are fresh first-order constants.
The procedure is as follows:

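1. $\mathrm{gener}(x, \Gamma, p) = \{\alpha_{x,p} \doteq \Gamma(x)\}$ when $x \in \mathrm{Dom}(\Gamma)$ and $\Gamma(x) \ne 0$;

2. $\mathrm{gener}(x, \Gamma, p) = \{c_0 \doteq c_1\}$ when $x \notin \mathrm{Dom}(\Gamma)$, or $x \in \mathrm{Dom}(\Gamma)$ and $\Gamma(x) = 0$;

3. $\mathrm{gener}(MN, \Gamma, p) = \{\alpha_{M,p \cdot l} \doteq (\forall x : 0)\alpha^{\circ}_{MN,p},\ \alpha^{\circ}_{MN,p}(x := N) \doteq \alpha_{MN,p}\} \cup \mathcal{E}_M \cup \mathcal{E}_N$
provided that $N$ is a homogeneous first-order term, $x$ is a fresh
first-order variable, $\mathcal{E}_M = \mathrm{gener}(M, \Gamma, p \cdot l)$ and $\mathcal{E}_N = \mathrm{gener}(N, \Gamma, p \cdot r)$;

4. $\mathrm{gener}(MN, \Gamma, p) = \{\alpha_{M,p \cdot l} \doteq \alpha_{N,p \cdot r} \to \alpha_{MN,p}\} \cup \mathcal{E}_M \cup \mathcal{E}_N$ provided that $N$
is not a homogeneous first-order term, $\mathcal{E}_M = \mathrm{gener}(M, \Gamma, p \cdot l)$ and
$\mathcal{E}_N = \mathrm{gener}(N, \Gamma, p \cdot r)$;

5. nondeterministically choose one of (a) or (b):

(a) $\mathrm{gener}(\lambda x.M, \Gamma, p) = \{\alpha_{\lambda x.M,p} \doteq \alpha_{x,p} \to \alpha_{M,p \cdot l}\} \cup \mathcal{E}_M$ where the set of
equations $\mathcal{E}_M = \mathrm{gener}(M, \Gamma \cup \{x : \alpha_{x,p}\}, p \cdot l)$,

(b) $\mathrm{gener}(\lambda x.M, \Gamma, p) = \{\alpha_{\lambda x.M,p} \doteq (\forall x : 0)\alpha_{M,p \cdot l}\} \cup \mathcal{E}_M$ where the set of
equations $\mathcal{E}_M = \mathrm{gener}(M, \Gamma \cup \{x : 0\}, p \cdot l)$.

We divide the set of type variables $\mathcal{X}$ so that $\mathcal{X} = \mathcal{X}_0 \cup \mathcal{X}_1$ where $\mathcal{X}_0 \cap \mathcal{X}_1 = \emptyset$ and
$\mathcal{X}_1 = \{\alpha_{M,p} \mid M \in \lambda P$, and $p$ is a path$\}$.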

Theorem 2. There exists a nondeterministic algorithm which for each first-order
context $\Gamma$ and Curry-style term $M$ has a run that gives a set of
e-equations $\mathcal{E}$ such that the following sentences are equivalent:

1. There exists a type $\tau$ such that $\Gamma \vdash M : \tau$ has a first-order derivation.

2. $\mathcal{E}$ has a solution.

Proof. The algorithm is described by the procedure gener. The procedure gives
nondeterministically a set of equations $\mathcal{E}$. By straightforward induction on the
term $M$, we prove both implications of the theorem. Details are omitted due to
lack of space.

4.2 Simplification of Equations 

Generally, types in equations from the previous subsection contain first-order
quantifiers of type 0 and arrows. We get rid of arrows using a procedure similar
to Robinson's unification.

To shorten the notation, we denote by $\forall^n \tau$ an e-type of the
form $(\forall x_1 : 0) \cdots (\forall x_n : 0)\tau$ where $n > 0$. To distinguish different $\forall^n$'s,
we sometimes supplement them with a subscript.

We present a procedure to simplify equations. The general idea behind the
procedure is to unify equations in the fashion of Robinson's unification, with
additional work connected with pushing explicit substitutions and first-order
quantifiers to the leaves.

Definition 16 (simplification procedure)

The procedure processes, step by step, pairs $(Q, S)$ where $Q$ is a sequence of
equations to be solved and $S$ is a substitution. The input of the procedure is a
pair $(Q_0, \emptyset)$, where $Q_0$ is the set of equations we are interested in.

The intended property of the substitution $S$ is that if the equations
in $Q$ are solvable by a substitution $S'$ then $S' \circ S$ solves $Q_0$.
The procedure terminates either when it explicitly fails or when the sequence
$Q$ consists only of pairs of one of the following three shapes:

$\forall^n_1 \alpha(\vec{x} := \vec{t}) \doteq \forall^n_2 \alpha'(\vec{y} := \vec{s})$  (1)

$\forall^n_1 \alpha(\vec{x} := \vec{t}) \doteq \forall^n_2 P(s_1, \dots, s_m)$  (2)

$\forall^n_1 P(t_1, \dots, t_n) \doteq \forall^n_2 P(s_1, \dots, s_n)$  (3)

where $\vec{x}, \vec{y}, \vec{t}, \vec{s}$ stand for appropriate vectors of variables and terms, and $n > 0$.

In the procedure, we use two kinds of type variables: normal variables and 
travelling variables. Travelling variables are used only in the proof of termination. 
They may be omitted in a working version of the algorithm. All variables in the 
input are marked as normal. 

At each step, the following cases are checked (we omit cases symmetric 
wrt. =): 

1. Let Q =11 V"((Ji ^ <J2){y ■= t) = T \\ ■ Q! . The present pair is transformed 
to 

(II V"(ai(y:=t)^a2(y :=*))= T II ■ Q! ,S). 



2. Let Q =11 V"((Vx : 0)cr2)(y := t) = t \\ ■ Q' . The present pair is transformed 
to 

(II Y\(yx' : 0)a2(x := x'){y' := t')) = r || • &,S) 

where x' does not occur in any of terms in t, and {y' := t') are those explicit 
substitutions for which y’s do not occur in V”. 

3. Let Q =11 V”P(si, • • • , Sm)(y := t) = r || • Q' . The present pair is trans- 
formed to 



(II V"P(si(y := t), . . . , s™(y := t)) = r || • Q', S). 

4 . Let Q =11 V”((Ji — > (T2) = V2 (ti ^ T2) || -Q'. The present pair is transformed 
to 

(II V^ai = V^n II • II V >2 = V^T 2 II • Q', S) 

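5. Let $Q = \|\forall^1 \alpha(\vec{x} := \vec{t}) = \forall^2 \tau_1 \to \tau_2\| \cdot Q'$ where $\vec{x}$ is the vector of variables
$x_1, \ldots, x_m$, $\vec{t}$ is the vector of terms $t_1, \ldots, t_m$, and $\alpha$ is a type variable
such that no cycle containing $\alpha$ in the graph $G_Q$ has an edge from Eg. The
present pair is transformed to

$$(\|\forall^1 (\alpha_1 \to \alpha_2)(\vec{x} := \vec{t}) = \forall^2 \tau_1 \to \tau_2\| \cdot Q'[\alpha := (\alpha_1 \to \alpha_2)],\ [\alpha := \alpha_1 \to \alpha_2] \circ S)$$

where $\alpha_1, \alpha_2$ are fresh variables. Additionally, we mark the variables $\alpha_1, \alpha_2$ as
travelling.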

6. Let $Q = \|\forall^1 \alpha(\vec{x} := \vec{t}) = \forall^2 ((\forall y : 0)\tau)\| \cdot Q'$ where $\vec{x}$ is the vector of variables
$x_1, \ldots, x_m$, $\vec{t}$ is the vector of terms $t_1, \ldots, t_m$, and $\alpha$ is a type variable
such that no cycle in the graph $G_Q$ contains both a vertex with $\alpha$ and an edge
from Eg. The present pair is transformed to

$$(\|\forall^1 ((\forall y' : 0)\alpha_1)(\vec{x} := \vec{t}) = \forall^2 ((\forall y : 0)\tau)\| \cdot Q'[\alpha := ((\forall y' : 0)\alpha_1)],\ [\alpha := ((\forall y' : 0)\alpha_1)] \circ S)$$

where $\alpha_1$ is a fresh type variable, and $y'$ is a fresh first-order variable.
Additionally, we mark the variable $\alpha_1$ as travelling.

7. Let $Q = \|\sigma = \tau\| \cdot Q'$ and let $\sigma = \tau$ be of one of the shapes (1)-(3). The
present pair is transformed to

$$(Q' \cdot \|\sigma = \tau\|, S).$$

8. In all other cases fail. 

Theorem 3. 1. The procedure terminates for all inputs of the form 

2. A system Q has a solution iff the result of the simplification procedure
applied to (Q, ∅) is (Q', S) where Q' has a solution and each equation in the
sequence is in one of the forms (1)-(3) given above.

Proof. Termination follows by reasoning similar to that used for Robinson's
unification. For the proof of the second claim, one shows that each
rule of the simplification procedure is sound and complete; the claim
then follows by induction. Details are omitted due to lack of space.

4.3 Removal of First-Order Quantifiers 

We obtain a set of e-equations by means of the simplification procedure. These
equations have a special form. They still contain first-order quantifiers, which do
not allow for a direct translation into second-order unification. We introduce a
procedure to remove them. The procedure uses as an intermediate data structure
a special kind of graph, which is defined as follows.

Definition 17 (graph of fixing)

A graph of fixing for a set of equations $\mathcal{E}$ is any graph with vertices

$$V_{\mathcal{E}} = X_{\mathcal{E}} \times TV(\mathcal{E})$$

where $X_{\mathcal{E}}$ is the set of first-order variables quantified in $\mathcal{E}$, and $TV(\mathcal{E})$ is
the set of type variables in $\mathcal{E}$. The edges of such graphs are unordered pairs.

We start the procedure of removal of first-order quantifiers with a fixing
graph in which an edge between $(x, \alpha)$ and $(x', \alpha')$ indicates that there exists an
equation with $\alpha$ and $\alpha'$ in which $x$ and $x'$ are quantified at the same place on both
sides of the equation. We then process fixing graphs in order to reach
the situation in which each edge between $(x, \alpha)$ and $(x', \alpha')$ indicates that $x$
and $x'$ should occur at exactly the same places in the terms that result from applying
a solution to $\alpha$ and $\alpha'$ respectively.

The procedure that removes quantifiers gives as a result a new set of equations 
with some negative constraints. These constraints say that some symbol may not 
occur in a type variable. We remove quantifiers as follows: 
Definition 18 (removal of first-order quantifiers)

The input for the procedure is a triple $(\Sigma, X, \mathcal{E})$ where $\Sigma$ is a signature, $X$ is a
set of first-order variables, and $\mathcal{E}$ is a set of e-equations included in $\mathcal{E}_{\Sigma}(X, \mathcal{X})$. The
output is a triple $(\Sigma', \mathcal{E}', \phi)$, where $\Sigma'$ is a signature, $\mathcal{E}'$ is a set of e-equations
included in $\mathcal{E}_{\Sigma'}(\emptyset, \mathcal{X})$, and $\phi : TV(\mathcal{E}') \to P(X_{\mathcal{E}})$. Equations are modified according
to the following schema:

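1. We build a graph of fixing $G_{\mathcal{E}}$ and $\phi_{\mathcal{E}} : TV(\mathcal{E}') \to P(X_{\mathcal{E}})$. They are the result
of the following iteration:

(a) We begin with $G_0 = (V_{\mathcal{E}}, E_0)$ and $\phi_0 : TV(\mathcal{E}) \to P(X_{\mathcal{E}})$ where

$$E_0 = \{((x_i, \alpha), (x'_i, \alpha')) \mid [\forall x_1 \ldots \forall x_i \ldots \forall x_n\, \alpha(\cdots) = \forall x'_1 \ldots \forall x'_i \ldots \forall x'_n\, \alpha'(\cdots)] \in \mathcal{E}\}$$

$\phi_0(\alpha) = \emptyset$ for each $\alpha$.

(b) We transform $G_n$ into $G_{n+1}$ only if there exists a path $p$ in $G_n$ from
$(x_i, \alpha)$ to $(x_j, \alpha)$ where $x_i \neq x_j$. We define $E_{n+1}$ and $\phi_{n+1}$ as follows:

i. take an edge in $p$, say $((y, \beta), (y', \beta'))$.

ii. remove the edge; the resulting set is $E_{n+1}$.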

iii. for 13, 

(/)„+! (7) = 4>nix) U {y} when there is an equation in £ 

Vzi . . . Vzj . . . \/zmf3{z'^ := t) = \/z[...\/zl... 'iz'^0lp ■ ■) 

where Zi = y and z[ = y' and does not contain y, 

(/)„+! (7) = when the former condition does not hold. 

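2. We produce a new set of equations $\mathcal{E}'$ by means of two steps:

(a) we generate a function $\psi : V_{\mathcal{E}} \to \mathrm{Const}$ such that $\psi^{-1}(c)$ is either $\emptyset$ or a
connected component in $G_{\mathcal{E}}$;

(b) we remove quantifiers in each equation

$$\forall z_1 \ldots \forall z_i \ldots \forall z_n\, \alpha(\vec{z} := \vec{t}) = \forall z'_1 \ldots \forall z'_i \ldots \forall z'_n\, \beta(\vec{z}' := \vec{t}')$$

using the following rules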

— if {zi,a) and {z[,(3) are in the same connected component then we 
replace both the variables by tp{{zi,a))] 

— if (zi,a) and {z[,(3) are in different connected components then 

• if Zi G we replace Xi and Xj by ^/>((zi, /?)), 

• if G 4>{(E) we replace Xi and xj by tp{{zi,a)), 

(if both cases hold we take the first option) . 

The signature we return contains all the symbols from $\Sigma$, all the symbols
from $X$, and all the first-order constants introduced in the above procedure.
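
Step 2(a) of the definition assigns one constant to every connected component of the fixing graph. The following Python fragment is a minimal sketch of that assignment only. The encoding of vertices as (variable, type variable) pairs, the helper name `assign_constants`, and the constant names `c0`, `c1`, ... are assumptions made for illustration and do not come from the original procedure.

```python
from collections import defaultdict


def assign_constants(vertices, edges):
    """Map every vertex to a constant shared by its connected component.

    vertices: iterable of hashable, sortable vertices, e.g. ("x", "alpha").
    edges: iterable of unordered pairs of vertices.
    Constant names c0, c1, ... are hypothetical placeholders.
    """
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    psi = {}
    next_id = 0
    for start in sorted(vertices):  # sorted only to make the naming reproducible
        if start in psi:
            continue
        constant = f"c{next_id}"
        next_id += 1
        stack = [start]
        while stack:  # depth-first walk over one component
            node = stack.pop()
            if node in psi:
                continue
            psi[node] = constant
            stack.extend(n for n in adjacency[node] if n not in psi)
    return psi


vertices = [("x", "a"), ("x1", "b"), ("y", "a"), ("y1", "c")]
edges = [(("x", "a"), ("x1", "b"))]
psi = assign_constants(vertices, edges)
```

In this toy graph the two vertices joined by an edge receive the same constant, while the isolated vertices each receive their own.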

The following fact is necessary to establish the correctness of the above
definition.

Property 2. Let $\forall z_1 \ldots \forall z_i \ldots \forall z_n\, \alpha(\vec{z} := \vec{t}) = \forall z'_1 \ldots \forall z'_i \ldots \forall z'_n\, \beta(\vec{z}' := \vec{t}')$
be an equation. If $G_k$ does not contain the edge $((z_i, \alpha), (z'_i, \beta))$ then either
$z_i \in \phi_k(\alpha)$ or $z'_i \in \phi_k(\beta)$.

Proof. Induction on k. 

The following fact explains why we should break paths from $(x_i, \alpha)$ to $(x_j, \alpha)$.
The existence of such a path means that $x_i$ and $x_j$ are equal.

Property 3. Let $p$ be a path in $G_k$ for some $k$ and let $S'$ be a solution of $\mathcal{E}$. If for
each edge $((z_i, \alpha), (z'_i, \beta))$ on $p$ and for each equation of the form

$$\forall z_1 \ldots \forall z_i \ldots \forall z_n\, \alpha(\vec{z} := \vec{t}) = \forall z'_1 \ldots \forall z'_i \ldots \forall z'_n\, \beta(\cdots)$$

we have that $z_i \notin \vec{t}$, then there exists a position $w$ such that for each vertex
$(y, \gamma)$ in $p$ we have that $y$ is on $w$ in $S'(\gamma)$.

Proof. Induction on the length of p. 

Property 4. The procedure of removal of first-order quantifiers

1 . terminates, 

2. £ has a solution T : X ^ 0) iff the result £' of removal of first-order 

quantifiers has a solution T' : X ^ Tf,,{X\{E U Dom(T)), 0) 

Proof. Property (1) is obvious.

The proof of (2) has two parts. We present a sketch of them here.

($\Rightarrow$) If $\mathcal{E}$ has a solution $T$ then we can construct a solution $T'$ of $\mathcal{E}'$. This is
done by replacing each constant $x$ in $T(\alpha)$ by $\psi((x, \alpha))$. As there are no paths
from $(x_i, \alpha)$ to $(x_j, \alpha)$ in $G_{\mathcal{E}}$, each first-order variable in $\alpha$ obtains a different
constant. The bullets in point (2b) of Definition 18 guarantee that this operation
results in a solution.

($\Leftarrow$) If $\mathcal{E}'$ has a solution $T'$ then we can reconstruct $T$ that solves $\mathcal{E}$ by simply
replacing fresh constants by the first-order variables they replaced. The existence of
$p$ in point (1b) of Definition 18 guarantees that only one first-order variable
may correspond to a constant in a type variable.

4.4 From Equations to Second-Order 

We have finally obtained a set of e-equations that can easily be transformed to a
special form of second-order unification equations. We have to deal with constraints,
though. The translation is defined as follows.

Definition 19 (translation to second-order) 

For each type variable $\alpha$, let $A_\alpha$ be the set of all the variables $x$ such that
there exists a type $\alpha(\vec{y} := \vec{t})(x := s)(\vec{z} := \vec{u})$ in the set being translated.
Assuming $A_\alpha = \{x_1, \ldots, x_n\}$, we replace each $\alpha(\vec{y} := \vec{t})$ by $F_\alpha(t'_1, \ldots, t'_n)$ where
$t'_i = t_j[x_{j+1} := t_{j+1}] \cdots [x_n := t_n]$ if $x_i = y_j$ and $t'_i = x_i$ if there is no $y_j = x_i$.

Constraints are translated to second-order constraints by replacing $\alpha$'s by
the corresponding $F$'s.
Immediately, we obtain the following property:

Property 5. For a given set $\mathcal{E}$ of equations of the forms (1)-(3) produced by the
simplification procedure, there exists a set $\mathcal{E}'$ of second-order unification equations
such that $\mathcal{E}$ is solvable if and only if $\mathcal{E}'$ is solvable.

Moreover, the translation from $\mathcal{E}$ to $\mathcal{E}'$ is effective.

The transformation that eliminates the constraints is as follows.

Definition 20 (removing constraints) 

Let $\mathcal{E}$ be a set of second-order equations with constraints $\phi$. For each constant
$c \in \Sigma$ we introduce two constants $c_1$ and $c_2$. For each second-order variable $F$
of arity $n$ we introduce two variables $F_1$ and $F_2$, both of arity $n + k$, where $k$ is
the number of constants in $\Sigma$. We define two operations $|\cdot|_1$ and $|\cdot|_2$ as follows
(for $i \in \{1, 2\}$):

- $|c|_i = c_i$,

- $|f(t_1, \ldots, t_n)|_i = f(|t_1|_i, \ldots, |t_n|_i)$,

- $|F(t_1, \ldots, t_n)|_i = F_i(|t_1|_i, \ldots, |t_n|_i, c^1_i, \ldots, c^k_i)$ where $\{c^1, \ldots, c^k\}$ is the set of
all constants in $\Sigma$.

Each equation $t_1 = t_2$ in $\mathcal{E}$ is replaced by a pair of equations $|t_1|_1 = |t_2|_1$
and $|t_1|_2 = |t_2|_2$. For each variable $F$ in $\mathcal{E}$ we supply additional equations
$F_1(a, \ldots, a, c^1_1, \ldots, c^k_1) = F_2(a, \ldots, a, c^1_1, \ldots, c^k_1)$ and
$F_1(a, \ldots, a, c^1_2, \ldots, c^k_2) = F_2(a, \ldots, a, c^1_2, \ldots, c^k_2)$ where $a$ is a fresh constant.
Finally, for each constraint $\phi(F)$ we supply the equation

$$F_1(a, \ldots, a, c^1_1, \ldots, c^k_1) = F_2(a, \ldots, a, d^1, \ldots, d^k)$$

where $d^i = c^i_1$ if $c^i \notin \phi(F)$ and $d^i = c^i_2$ if $c^i \in \phi(F)$.

Immediately we obtain the following property: 

Property 6. For a given set £ of equations with constraints (j), there exists a set 
£' of second-order unification equations such that £ is solvable if and only if £' 
is solvable. 

The translation from $\mathcal{E}$ to $\mathcal{E}'$ is effective and involves only equations of the
form (1-3) in Definition 21.

4.5 Solving of Final Equations 

In Subsection 4.2 we obtained sets of equations. Each equation is in one of three
forms.

Definition 21 (head equations) 

A set of equations, each in one of the following forms

1. $F_1(t_1, \ldots, t_n) = F_2(s_1, \ldots, s_m)$,
2. $F_1(t_1, \ldots, t_n) = P(s_1, \ldots, s_m)$,
3. $P_1(t_1, \ldots, t_n) = P_2(s_1, \ldots, s_m)$,

where $F_1, F_2$ are second-order variables and $P_1, P_2$ are symbols of first-order
constants, is called a set of head equations.

Now, we describe a procedure to solve such sets of equations.

We need the following facts: 

Theorem 4 (complete set of solutions for second-order matching). For each set $E$
of second-order matching equations, if the set is solvable then it has a finite number
of solutions with domains equal to $TV(E)$, and all of them can be effectively
generated.



Property 7. If the set $E$ of second-order unification equations has only equations of the
form $F_1(t_1, \ldots, t_n) = F_2(s_1, \ldots, s_m)$, then it has a ground solution provided
that the signature has at least one constant symbol.

The first one is proved in [HL78]. The second is obvious: the solution
assigns the same term with no arguments to all the second-order variables. This
construction is applicable, though, only if we have at least one constant symbol
in the signature.

Definition 22 (solving procedure) 

The nondeterministic procedure to solve our second-order equations is defined
as follows:

1. Check if there are equations of the form $P_1(t_1, \ldots, t_n) = P_2(s_1, \ldots, s_m)$
where the sides of the equation are different. If so, fail; otherwise remove all
equations of the form $P(t_1, \ldots, t_n) = P(t_1, \ldots, t_n)$.

2. Find the complete set A of solutions for all the equations of the form
$F(t_1, \ldots, t_n) = P(s_1, \ldots, s_m)$ (the set exists by Theorem 4). If there are
no such equations then go to step 5.

3. Choose one of the solutions and apply it to all equations.

4. Go to step 1.

5. There are only equations of the form $F_1(t_1, \ldots, t_n) = F_2(s_1, \ldots, s_m)$. These
equations always have a solution (provided that there is at least one constant
in the signature), so we accept in this case.

We have the following theorem: 

Theorem 5. The problem of whether a given set of head equations is solvable is
decidable.

At last, we obtain: 

Theorem 6. The first-order type inference problem is decidable. 

Proof. A consequence of Theorem Theorem Fact and Theorem 
Abstract. An adaptive program is an object-oriented program which
is abstracted over the particular class structure. This abstraction fosters
software reuse, because programmers can concentrate on specifying how
to process the objects which are essential to their application. The compiler
of an adaptive program takes care of actually locating the objects.
The adaptive programmer merely writes a traversal specification decorated
with actions. The compiler instantiates the specification with the
actual class structure and generates code that traverses a collection of
objects, performing visits and actions according to the specification.
Previous approaches to compiling adaptive programs rely on standard 
methods from automata theory and graph theory to achieve their goal. 
We introduce a new foundation for the compilation of adaptive programs, 
based on the algebraic properties of traversal specifications. Exploiting 
these properties, we develop the underlying theory for an efficient com- 
pilation algorithm. A key result is the derivation of a normal form for 
traversal specifications. This normal form is the basis for directly gener- 
ating a traversal automaton with a uniformly minimal number of states. 



Key words: object-oriented programming, semantics, finite automata, compi- 
lation 

1 Introduction 

An adaptive program |121 [T^ [TTl [15] is an object-oriented program which is 
abstracted over the particular class structure. Adaptive programming moves the 
burden of navigating through a linked structure of objects of many different 
classes from the programmer to the compiler. The key idea is to only specify the 
landmarks for navigation and the actions to be taken at the landmarks, and leave 
to the compiler the task of generating traversal code to locate the “landmark” 
classes and to perform the actions. 

This abstraction fosters software reuse in two dimensions. First, the same 
adaptive program applies unchanged to many similar problems. For example, 
consider the adaptive program Average that visits objects of class Item and 
computes the average of the field amount therein. This program can be compiled 
with respect to a class structure for a company, instantiating Item to Employee 

J. Tiuryn (Ed.): FOSSACS 2000, LNCS 1784, pp. 314-EM 2000. 
and amount to salary, to compute the average salary of the employees. But the
same program can also be compiled by instantiating Item to InventoryItem and
amount to price. This instance computes the average price of all items in stock.

Second, adaptive programming is attractive for programming in an evolv- 
ing environment. Here, “evolving” means that classes, instance variables, and 
methods are added, deleted, and renamed, as customary in refactoring P31EUH]. 
In this situation, many adaptive programs need merely be recompiled without 
change, thus alleviating the tedious work of refactoring considerably. 

An adaptive program consists of two parts: a traversal specification and wrap- 
per (action) specifications. The traversal specification mentions classes whose 
objects must (or must not) be visited in a certain order and the instance vari- 
ables that must (or must not) be traversed. A wrapper specification links a class 
to an action that has to be performed when the traversal encounters an object of 
that class. Following Palsberg et al. [15, 14], our semantics only considers actions
to be performed on the first encounter with an object. 

Although a traversal specification only mentions names of classes and in- 
stance variables that are relevant for the programming task at hand, the ac- 
tual class structure, for which the adaptive program is compiled, may contain 
intermediate classes and additional instance variables. The compiler automati- 
cally generates all the code to traverse or ignore these objects. Likewise, wrapper 
specifications need only be present for classes whose objects require special treat- 
ment. Hence, the programmer writes the important parts of the program and 
the compiler fills in the boring rest. 



1.1 Related Work 

Due to the high-level programming style of adaptive programs, their compila- 
tion is an interesting problem. Palsberg et al US] define a formal semantics for 
adaptive programs, formalizing Lieberherr’s original approach to compilation 
d!, and identify a number of restrictions. A subsequent paper m removes the 
restrictions and simplifies the semantics, but leads to a compilation algorithm 
which runs in exponential time in the worst case. Both papers rely on the theory 
of finite automata and employ standard constructions, like minimization and the 
powerset construction (which leads to the exponential worst case behavior). In 
addition, these works employ a more restrictive notion of traversal specification 
than the present paper. 

Finally, Lieberherr and Patt-Shamir m introduce further generalizations 
and simplifications which lead to a polynomial-time compilation algorithm. How- 
ever, whereas the earlier algorithms perform “static compilation” , which pro- 
cesses all compile-time information at compile time, their polynomial-time al- 
gorithm performs “dynamic compilation”, which means that a certain amount 
of compile-time information is kept until run time and hence compile-time work 
is spread over the code implementing the traversal. They employ yet another 
notion of traversal specification. While this is more general than their earlier 
work, the relation to our specifications is not clear. 






Peter Thiemann 



The algebraic approach is based on a notion of derivatives which is closely 
related to quotients of formal languages and to derivatives of regular
expressions [6, 7, 1]. However, traversal specifications differ from standard regular
expressions, so our derivatives are novel to this work.

There is a companion paper dealing with the practical aspects of compiling 
adaptive programs by partial evaluation m- 



1.2 Contribution of This Work 

The algebraic foundations of adaptive programming are based on the algebraic 
properties of traversal specifications. Exploiting the algebraic laws, we define 
a normal form for traversal specifications. If the specification contains alternative
paths (like + in a regular expression) then the size of the normal form can
be exponential in the size of the original specification, so that its computation 
takes exponential time, too. We show that the exponential bound is tight by ex- 
hibiting a suitable specification. For a specification without alternatives (coined 
“multiplicative specification” m) this step takes linear time. 

Starting from a traversal specification in normal form our algorithm computes 
the state skeleton of the uniformly minimal traversal automaton, using a notion 
of derivatives for traversal specifications. Uniform minimality means that the 
number of states is minimal over all automata that implement the traversal
for all possible class structures. This step takes linear time for multiplicative 
specifications and exponential time in the worst case for general specifications. 
We show that the exponential bound is tight. 

Only the final compilation step requires the actual class structure. It con- 
structs the actual traversal automaton from the state skeleton and the class 
structure. It takes time proportional to the product of the sizes of both. We 
prove that the resulting automaton implements the semantics of a traversal 
specification. Hence the automaton is equivalent to the one constructed with 
“static compilation” by Palsberg et al m- 

The main technical contribution of this work is the exploration of the algebra 
of traversal specifications, in particular Theorems [T] and El which demonstrate 
that the formally constructed automaton is indeed uniformly minimal. These 
theorems can also be viewed as normal form results for a certain class of regular 
expressions. 



Overview. Section 2 establishes some formal preliminaries and defines a semantics
of adaptive programs. Section 3 explores traversal specifications and their
algebraic properties, culminating in the first compilation step. Section 4 deals
with the second compilation step, the construction of the uniformly minimal
traversal automaton. Section 5 considers extensions and further work, and
Section 6 concludes. A companion technical report m contains an appendix with
all proofs.
2 Semantics of Adaptive Programs

This section first recalls the basic concepts of class graphs and object graphs used
to define the semantics of adaptive programs. Then, we define a generalized (with
respect to previous work [12, 15]) notion of traversal specifications and use it to
define a semantics of adaptive programs.

2.1 Graphs 

A labeled directed graph is a triple $(V, E, L)$ where $V$ is a set of nodes, $L$ is a set
of labels, and $E \subseteq V \times L \times V$ is the set of edges. Write $u \xrightarrow{l} v$ for the edge
$(u, l, v) \in E$; then $u$ is the source, $l$ the label, and $v$ the target of the edge.

Let $G = (V, E, L)$ be a labeled directed graph. A path from $v_0$ to $v_n$ is a
sequence $(v_0, l_1, v_1, l_2, \ldots, l_n, v_n)$ where $n \geq 0$, $v_0, \ldots, v_n \in V$, $l_1, \ldots, l_n \in L$,
and, for all $1 \leq i \leq n$, there is an edge $v_{i-1} \xrightarrow{l_i} v_i \in E$. The set of all paths in
$G$ is $\mathrm{Paths}(G)$.

If $p = (v_0, l_1, \ldots, v_n)$ and $p' = (v'_0, l'_1, \ldots, v'_m)$ are paths with $v_n = v'_0$ then
define the concatenation $p \cdot p' = (v_0, l_1, \ldots, v_n, l'_1, \ldots, v'_m)$. For sets of paths $P$
and $P'$ let $P \cdot P' = \{p \cdot p' \mid p \in P, p' \in P', p \cdot p' \text{ is defined}\}$.
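
Path concatenation is partial: it is only defined when the first path ends where the second begins. A minimal Python sketch, assuming paths are encoded as tuples that alternate nodes and labels (an encoding chosen here for illustration; `concat` and `concat_sets` are hypothetical helper names):

```python
def concat(p, q):
    """Return the concatenation of paths p and q, or None if it is undefined.

    A path is a tuple (v0, l1, v1, ..., ln, vn); a single node is (v0,).
    """
    if p[-1] != q[0]:
        return None
    return p + q[1:]  # the shared node appears only once


def concat_sets(P, Q):
    """All defined concatenations of a path from P with a path from Q."""
    result = set()
    for p in P:
        for q in Q:
            r = concat(p, q)
            if r is not None:
                result.add(r)
    return result


P = {("A", "<>", "C"), ("A", "<>", "B")}
Q = {("C", "b", "D")}
joined = concat_sets(P, Q)
```

Only the path of `P` that ends in `C` can be extended by the single path in `Q`; the other combination is dropped because its concatenation is undefined.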

2.2 Class Graphs and Object Graphs 

Let $\mathcal{C}$ be a set of class names and $\mathcal{N}$ be a set of instance names, totally ordered
by $\leq$. A class graph is a finite labeled directed graph $\mathcal{G}_C = (\mathcal{C}, \mathcal{E}_C, \mathcal{N} \cup \{\Diamond\})$.

There are two kinds of edges in the class graph. A construction edge has
the form $u \xrightarrow{l} v$ where $l \in \mathcal{N}$ (that is, $l \neq \Diamond$). It indicates that objects of class $u$
have an instance variable $l$ containing objects of class $v$. There is at most one
construction edge with source $u$ and label $l$. Each cycle in $\mathcal{G}_C$ involves at least
one construction edge.

An edge $u \xrightarrow{\Diamond} v$ is a subclass edge, indicating that $v$ is a subclass of $u$.
Without loss of generality 121I51II2] we assume that class graphs are simple, i.e.,
every class is either abstract (all outgoing edges are subclass edges) or concrete
(all outgoing edges are construction edges). In addition, if $u \xrightarrow{\Diamond} v \in \mathcal{E}_C$ then $v$
is concrete.

Figure 1 shows an example class graph with an abstract class A and three
concrete classes B, C, and D. Dashed arrows indicate subclass edges, solid arrows
indicate construction edges. Class A has subclasses B and C. Class B has one
instance variable a of class D. Class C has an instance variable b of class D and
another c of class A.

Let $\Omega$ be a set of objects. An object graph is a finite labeled graph $\mathcal{G}_O = (\Omega, \mathcal{E}_O, \mathcal{N})$
such that there is at most one edge with source $u$ and label $l$. The edge $u \xrightarrow{l} v$
means that the instance variable $l$ in object $u$ holds the object $v$.

Figure 2 shows an example object graph corresponding to the class structure
in Fig. 1. The objects C1 and C2 have class C, B1 has class B, and D1, D2, and
D3 are objects of class D.
A class map is a mapping $\mathrm{Class} : \Omega \to \mathcal{C}$ from objects to class names of
concrete classes. The subclass map $\mathrm{Subclasses} : \mathcal{C} \to \mathcal{P}(\mathcal{C})$ maps a class name to
the set of class names of all its subclasses, including itself. $\mathrm{Subclasses}(A)$ is the
set of all $B \in \mathcal{C}$ such that there is a path $(A, \Diamond, \ldots, \Diamond, B)$ in the class graph.
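
As an illustration, the subclass map can be computed by following subclass edges only. The sketch below encodes the example class graph described for Figure 1; the edge encoding as (source, label, target) triples and the string `"<>"` standing for the subclass label are assumptions of this example.

```python
SUBCLASS = "<>"  # stands for the subclass label (an encoding chosen here)

# Example class graph: abstract class A with subclasses B and C;
# B has instance variable a of class D; C has b of class D and c of class A.
CLASS_EDGES = frozenset({
    ("A", SUBCLASS, "B"),
    ("A", SUBCLASS, "C"),
    ("B", "a", "D"),
    ("C", "b", "D"),
    ("C", "c", "A"),
})


def subclasses(cls, edges=CLASS_EDGES):
    """Classes reachable from cls via subclass edges only, including cls."""
    result = {cls}
    frontier = [cls]
    while frontier:
        u = frontier.pop()
        for src, label, dst in edges:
            if src == u and label == SUBCLASS and dst not in result:
                result.add(dst)
                frontier.append(dst)
    return result
```

Construction edges such as `("C", "c", "A")` are ignored, so `subclasses("C")` contains only `"C"` even though the graph has an edge from C back to A.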

2.3 Traversal Specifications 

A traversal specification answers the question “where do we go from here?”
at some point of a traversal of an object or class graph. Hence, a traversal
specification is either $B$ (denoting a path to an object of class $B$), a concatenation
of specifications, or an alternative of specifications. Figure 3 shows the formal
syntax.

The semantics of a traversal specification is specified relative to a starting
node $A$. It is a set of paths in a class graph.

$$\mathrm{RPathSet}(A, B) = \{(A, l_1, A_1, \ldots, l_n, A_n) \in \mathrm{Paths}(\mathcal{G}_C) \mid A_n \in \mathrm{Subclasses}(B)\}$$
$$\mathrm{RPathSet}(A, p_1 \cdot p_2) = \bigcup_{B \in \mathrm{Target}(p_1)} \mathrm{RPathSet}(A, p_1) \cdot \mathrm{RPathSet}(B, p_2)$$
$$\mathrm{RPathSet}(A, p_1 + p_2) = \mathrm{RPathSet}(A, p_1) \cup \mathrm{RPathSet}(A, p_2)$$

The function Target yields the set of possible target classes of a traversal.

$$\mathrm{Target}(B) = \{B\}$$
$$\mathrm{Target}(p_1 \cdot p_2) = \mathrm{Target}(p_2)$$
$$\mathrm{Target}(p_1 + p_2) = \mathrm{Target}(p_1) \cup \mathrm{Target}(p_2)$$
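
The three Target equations translate directly into a recursive function. In the sketch below a specification is encoded as either a class name (a string) or a tuple `("cat", p1, p2)` or `("alt", p1, p2)`; this encoding is an assumption of the example, not notation from the paper.

```python
def target(spec):
    """Set of possible target classes of a traversal specification."""
    if isinstance(spec, str):          # simple path to a class
        return {spec}
    op, left, right = spec
    if op == "cat":                    # concatenation: targets of the second part
        return target(right)
    if op == "alt":                    # alternative: union of both sides
        return target(left) | target(right)
    raise ValueError(f"unknown operator: {op!r}")


spec = ("cat", "A", ("alt", "B", ("cat", "C", "D")))
targets = target(spec)
```

For the specification A · (B + C · D) the possible targets are B and D; the intermediate classes A and C do not appear.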












An Algebraic Foundation for Adaptive Programming



p ::= B        simple path to B
    | p · p    concatenation
    | p + p    alternative

Fig. 3. Traversal specifications 



The definition of the semantics naturally adapts to object graphs, by replacing
occurrences of class names with objects of the respective classes:

$$\mathrm{RPathSet}_{\mathcal{G}_O}(A, B) = \{(o_0, l_1, o_1, \ldots, o_n) \in \mathrm{Paths}(\mathcal{G}_O) \mid \mathrm{Class}(o_0) \in \mathrm{Subclasses}(A),\ \mathrm{Class}(o_n) \in \mathrm{Subclasses}(B)\}$$



2.4 Semantics 

An adaptive program is a pair (p, W) of a traversal specification p and a wrapper 
map W. The map W maps a class name $A \in \mathcal{C}$ to an action to be executed when
visiting an object of class A. Given an object graph $\mathcal{G}_O$, the semantics of (p, W)
with respect to some initial object o is completely determined by listing the
objects in the order in which they are traversed. Formally,

$$\mathrm{Trav}(p, o) = \mathrm{Seq}(\mathrm{RPathSet}_{\mathcal{G}_O}(o, p))$$

where

$$\mathrm{Seq}(\Pi) = o_0\, \mathrm{Seq}(\Pi_1) \ldots \mathrm{Seq}(\Pi_n)$$

where

$$\{o_0\} = \{o \in \Omega \mid o \ldots \in \Pi\}$$
$$\{l_1, \ldots, l_n\} = \{l \in \mathcal{N} \mid o_0\, l \ldots \in \Pi\}, \quad l_i < l_{i+1}$$
$$\Pi_i = \{w \in \Omega(\mathcal{N}\Omega)^* \mid o_0\, l_i\, w \in \Pi\}$$

To see that Trav(p, o) is well-defined, observe that

1. $o_0$ is uniquely determined in the first expansion of Seq() because each path
in $\mathrm{RPathSet}_{\mathcal{G}_O}(o, p)$ starts with the initial object o;

2. $o_0$ is uniquely determined in every recursive expansion of Seq() because the
initial segment $o_0 l_i$ of a path in an object graph completely determines the
next object (there are no inheritance edges in an object graph).

To run the adaptive program on o means to execute the wrappers specified 
by W in the sequence prescribed by Trav(p, o). 
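
Read operationally, Seq lists the common start object of a path set and then recurses into the sub-path sets reached through each outgoing label, in label order. The following Python sketch illustrates this reading for a finite path set. The tuple encoding of paths and the use of Python's string ordering for instance names are assumptions of the example, and the object names are illustrative.

```python
def seq(paths):
    """Pre-order listing of the objects visited by a finite set of paths.

    Every path is a tuple (o0, l1, o1, ..., ln, on) and all paths are
    expected to start at the same object.
    """
    paths = set(paths)
    if not paths:
        return []
    starts = {p[0] for p in paths}
    if len(starts) != 1:
        raise ValueError("all paths must start at the same object")
    (o0,) = starts
    labels = sorted({p[1] for p in paths if len(p) > 1})  # l_1 < ... < l_n
    order = [o0]
    for label in labels:
        # Pi_i: the remainders of all paths that leave o0 through this label
        rest = {p[2:] for p in paths if len(p) > 1 and p[1] == label}
        order.extend(seq(rest))
    return order


paths = {
    ("C1",),
    ("C1", "b", "D1"),
    ("C1", "c", "B1", "a", "D2"),
}
visit_order = seq(paths)
```

Running the wrappers in the order given by `visit_order` corresponds to executing the adaptive program on the initial object in this toy example.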

3 Traversal Algebra 

In this section, we investigate some algebraic laws that hold for traversal spec- 
ifications and define a normal form for them. Working towards a compilation 
$$(p_1 \cdot p_2) \cdot p_3 = p_1 \cdot (p_2 \cdot p_3)$$

$$(p_1 + p_2) + p_3 = p_1 + (p_2 + p_3)$$

$$p_1 + p_2 = p_2 + p_1$$

$$p + p = p$$

$$p_1 \cdot (p_2 + p_3) = (p_1 \cdot p_2) + (p_1 \cdot p_3)$$

$$(p_1 + p_2) \cdot p_3 = (p_1 \cdot p_3) + (p_2 \cdot p_3)$$



= P 



Fig. 4. Laws for traversal specifications 



algorithm, we define a notion of derivative for traversal specifications, where 
taking the derivative of a specification corresponds to visiting a certain node 
during a traversal. Finally, we consider the complexity of the resulting compila- 
tion algorithm. 

3.1 Algebraic Laws 

Traversal specifications obey some algebraic laws. The concatenation of specifications
· is associative. The alternative of specifications + is associative, commutative,
and idempotent. Furthermore, concatenation · distributes over +. Figure 4
shows the resulting laws.

Lemma 1. The algebraic laws given in Fig. 4 are correct, in the sense that if
$p_1 = p_2$ is a law then, for all $A$, $\mathrm{RPathSet}(A, p_1) = \mathrm{RPathSet}(A, p_2)$.

A further law compresses a concatenation of simple paths even further.

Lemma 2. $A \cdot A = A$

Given an arbitrary total ordering on the set of class names, a traversal
specification is in normal form if it has the form $w_1 + w_2 + \cdots + w_n$ where each $w_i$
is a traversal word (that is, it is generated by the grammar $w ::= B \mid B \cdot w$)
and $w_i < w_{i+1}$ in the induced lexicographic ordering. The function norm maps a
traversal specification to its normal form. In the algorithm, $\Box$ denotes the empty
traversal specification.

$$\mathrm{norm}(p) = \mathrm{sort}(\mathrm{norm}'(p, \Box))$$

$$\mathrm{norm}'(B, w_1 + \cdots + w_n) = \mathrm{prefix}(B, w_1) + \cdots + \mathrm{prefix}(B, w_n)$$

$$\mathrm{norm}'(p_1 \cdot p_2, L) = \mathrm{norm}'(p_1, \mathrm{norm}'(p_2, L))$$

$$\mathrm{norm}'(p_1 + p_2, L) = \mathrm{norm}'(p_1, L) + \mathrm{norm}'(p_2, L)$$




The function sort merges traversal specifications by sorting an alternative of
words and removing duplicates. The function norm has exponential complexity
in the worst case. To see that, consider the family of specifications
$p_n = (A_1 + B_1) \cdots (A_n + B_n)$ for distinct $A_i$ and $B_i$. Each $p_n$ has size linear in $n$, but
$\mathrm{norm}(p_n) = A_1 \cdot A_2 \cdots A_n + \ldots + B_1 \cdot B_2 \cdots B_n$ has size proportional to $2^n$.

Lemma 3. For all $A$ and $p$: $\mathrm{RPathSet}(A, p) = \mathrm{RPathSet}(A, \mathrm{norm}(p))$.
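
The normalization equations can be sketched in a few lines of Python. Several points below are assumptions rather than statements from the text: specifications are encoded as class names or `("cat", p1, p2)` / `("alt", p1, p2)` tuples, the empty specification is modelled as a list holding one empty word, and `prefix` is taken to prepend a class name unless the word already starts with it (in the spirit of Lemma 2).

```python
def prefix(cls, word):
    # Assumed behaviour: prepend cls, but do not repeat it (A . A = A).
    return word if word[:1] == (cls,) else (cls,) + word


def norm_prime(spec, words):
    """Distribute spec over the list of words (tuples of class names)."""
    if isinstance(spec, str):
        return [prefix(spec, w) for w in words]
    op, left, right = spec
    if op == "cat":
        return norm_prime(left, norm_prime(right, words))
    if op == "alt":
        return norm_prime(left, words) + norm_prime(right, words)
    raise ValueError(f"unknown operator: {op!r}")


def norm(spec):
    """Sorted, duplicate-free list of traversal words."""
    return sorted(set(norm_prime(spec, [()])))


def family(n):
    """The specification (A1 + B1) . ... . (An + Bn), for n >= 1."""
    spec = ("alt", f"A{n}", f"B{n}")
    for i in range(n - 1, 0, -1):
        spec = ("cat", ("alt", f"A{i}", f"B{i}"), spec)
    return spec


words = norm(family(3))
```

For `family(3)` the result has eight words, matching the exponential growth described above, while the input has only six class names.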

3.2 Second Normal Form 

Further simplifications are possible for specifications in normal form. First, define
another ordering on words. A word $v$ is less than or equal to a word $w \cdot A_{n+1}$ if $v$ is
a subsequence of $w$ followed by a non-empty ascending sequence of superclasses
of $A_{n+1}$.

Definition 1. Let $m \geq 0$. Define $v \preceq_m w$ iff there is some $n \geq 0$ such that
$v = A_1 A_2 \cdots A_n B_1 B_2 \cdots B_{m+1}$ and there exist $\alpha_1, \ldots, \alpha_{n+1}$ such that
$w = \alpha_1 A_1 \alpha_2 A_2 \cdots A_n \alpha_{n+1} A_{n+1}$ where $A_{n+1} \in \mathrm{Subclasses}(B_1)$ and, for all $1 \leq i \leq m$,
$B_i \in \mathrm{Subclasses}(B_{i+1})$.

Write $v \preceq w$ if there exists some $m$ such that $v \preceq_m w$.

We will be most interested in the special case where $m = 0$. It turns out that
$\preceq_0$ is related to the reverse inclusion of the corresponding path sets.

Proposition 1. The relation $\preceq$ is a partial ordering on the set of normalized
traversal words.

Corollary 1. The relation $\preceq_0$ is a partial ordering on traversal words.

Lemma 4. For all words $v$ and $w$, if $w \preceq_0 v$ then, for all $A$, $\mathrm{RPathSet}(A, v) \subseteq
\mathrm{RPathSet}(A, w)$.

The proof is by induction on $v$.

The lemma does not extend to $\preceq_m$ for $m > 0$, hence it does not hold for $\preceq$.
To see this, consider three distinct class names $A$, $B$, and $D$ so that $A \cdot B \preceq_1 D$ if
$D \in \mathrm{Subclasses}(A)$ and $A \in \mathrm{Subclasses}(B)$. Now, $\mathrm{RPathSet}(D, D)$ contains $(D)$,
but $(D) \notin \mathrm{RPathSet}(D, A \cdot B)$, since every element of the latter contains $A$.

As a corollary, we have the following additional law. 

Lemma 5. Suppose $w \preceq_0 v$; then $v + w = w$.

Proof. We need to show that, for each $A$, $\mathrm{RPathSet}(A, v + w) = \mathrm{RPathSet}(A, w)$.
The inclusion $\mathrm{RPathSet}(A, w) \subseteq \mathrm{RPathSet}(A, v + w)$ is obvious by definition.
The reverse inclusion follows from Lemma 4.

Definition 2. A traversal specification $p$ is in second normal form (2NF) if it
is in normal form $p = w_1 + \ldots + w_n$ and, for all $i, j \in \{1, \ldots, n\}$, if $i \neq j$ then
$w_i \not\preceq_0 w_j$.

To get norm to produce traversal specifications in 2NF, it suffices to have
sort not just remove duplicates but also remove each word $w$ if there is another
word $v$ such that $v \preceq_0 w$. Call this modified function sort'.
$$d_X(A) = A$$

$$d_X(A \cdot w) = \begin{cases} w & \text{if } X = A \\ A \cdot w & \text{otherwise} \end{cases}$$

$$d_X(w_1 + \ldots + w_n) = d_X(w_1) + \ldots + d_X(w_n)$$

Fig. 5. Derivative of a traversal specification



3.3 Formal Derivatives 

Another view of a traversal specification is to consider it as a state in a traversal.
For example, the specification A means “continue until you find some object of
class A”, and the specification B · A means “search for some B-object and then
continue looking for some A-object”. Clearly, whenever a traversal visits a node
of class X in the object graph, the traversal specification might change to reflect
the new state of the traversal. For example, if the traversal hits upon a B-object
in state B · A, the traversal continues through B's instance variables in state
A. In principle, the new state should be B · A + A because, by definition, the
traversal should still look for As following some B. However, Lemma 5 shows
that A is equivalent to B · A + A.

Formally, when a traversal in state p encounters an X-object, the new traversal
specification for the instance variables is the derivative of the old specification
p with respect to the class X, that is $d_X(p)$. The definition in Fig. 5 assumes
that p is already in normal form.
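
A minimal Python sketch of the derivative, assuming that a word is a non-empty tuple of class names, that a normal-form specification is a collection of such words, and that the first case of the definition applies exactly when the visited class equals the first class of the word (the helper names are hypothetical):

```python
def d_word(x, word):
    """Derivative of a single traversal word with respect to class x."""
    if len(word) > 1 and word[0] == x:
        return word[1:]      # the leading class has been found
    return word              # a single class, or x does not match


def d_spec(x, spec):
    """Derivative of a normal-form specification, taken word by word."""
    return sorted({d_word(x, w) for w in spec})


state = [("B", "A"), ("C", "A")]
after_b = d_spec("B", state)
```

Visiting a B-object in state B · A + C · A leaves the state A + C · A; this sketch does not apply the further 2NF simplification.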

Lemma 6. If $p$ is in (second) normal form then so is $\mathrm{sort}'(d_X(p))$, for all $X$.

Computation of $d_X(p)$ takes time linear in the length of $p$. Subsequent
normalization boils down to a single run of sort' on the derivative, which takes
$O(n^2 \log n)$ where $n$ is the length of $p$ (since each comparison may take time
linear in $n$).



3.4 The State Skeleton 

Let Der(p) be the set of all iterated derivatives of a traversal specification p. 
Since it only depends on p, it is possible to precompute Der(p) before a concrete 
class graph is given. 

The complexity of computing Der($p$) is clearly governed by its size $|\mathrm{Der}(p)|$,
for $p$ in normal form. First, for words the table

$w$            $\mathrm{Der}(w)$                       $|\mathrm{Der}(w)|$
$A$            $\{A\}$                                 $1$
$A \cdot w$    $\{A \cdot w\} \cup \mathrm{Der}(w)$    $1 + |\mathrm{Der}(w)|$

determines the number of derivatives. That is, $|\mathrm{Der}(w)|$ is linear in the size of $w$.

For a general specification $p = w_1 + \dots + w_n$, which also includes alternatives,
$\mathrm{Der}(w_1 + \dots + w_n) \subseteq \{v_1 + \dots + v_n \mid v_i \in \mathrm{Der}(w_i)\}$ (some alternatives may be
deleted due to Lemma [?]), so it holds that
$|\mathrm{Der}(w_1 + \dots + w_n)| \leq |\mathrm{Der}(w_1)| \cdots |\mathrm{Der}(w_n)| \leq |w_1| \cdots |w_n|$.
Depending on the structure of $p$, $|w_1| \cdots |w_n|$ can range
from linear in $|p|$ (for $n = 1$) to exponential in $|p|$. To demonstrate the latter,
consider the following example. Let $w_i = A_i \cdot B_i \cdot C$ where all $A_i$ and $B_i$ and $C$ are
distinct. In this case, $|\mathrm{Der}(w_1 + \dots + w_n)| = 2^n + 1$ (by straightforward induction
using Lemma [?]), whereas the size of $w_1 + \dots + w_n$ is linear in $n$. Therefore, the
exponential bound is tight.
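The count $2^n + 1$ can be checked experimentally with a small Python sketch. Two things here are assumptions rather than the paper's definitions: words are tuples and specifications are frozensets of words, and the pruning done by sort' is approximated by dropping a word whenever another word of the specification is a proper suffix of it (which is enough for this example, where $C$ subsumes $B_i \cdot C$ and $A_i \cdot B_i \cdot C$). The function names are hypothetical.

```python
from itertools import chain

def deriv_word(x, word):
    """Derivative of one word (a tuple of class names) with respect to x."""
    if len(word) == 1:
        return word
    return word[1:] if word[0] == x else word

def prune(words):
    """Stand-in for sort'. ASSUMPTION: a word is dropped when another
    word of the specification is a proper suffix of it."""
    words = set(words)
    return frozenset(
        w for w in words
        if not any(len(v) < len(w) and w[-len(v):] == v for v in words)
    )

def iterated_derivatives(spec):
    """All specifications reachable from spec by repeated derivation."""
    classes = set(chain.from_iterable(spec))
    start = prune(spec)
    seen, todo = {start}, [start]
    while todo:
        p = todo.pop()
        for x in classes:
            d = prune(deriv_word(x, w) for w in p)
            if d not in seen:
                seen.add(d)
                todo.append(d)
    return seen

def example(n):
    """The specification A1.B1.C + ... + An.Bn.C from the text."""
    return frozenset((f"A{i}", f"B{i}", "C") for i in range(1, n + 1))

for n in range(1, 5):
    print(n, len(iterated_derivatives(example(n))))
```

Each alternative is either untouched or has lost its $A_i$, giving $2^n$ states, and every state in which some alternative reaches $C$ collapses to the single specification $C$.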

3.5 Compiling Traversal Specifications 

Compiling a traversal specification means computing the set of its iterated
derivatives. In the case of a multiplicative specification [13] (which does not use
$+$), compilation takes linear time. First, normalization of a specification boils
down to removing repeated class names, which can be done in linear time. Second,
computing the normalized derivative takes unit time. Finally, the number
of normalized derivatives is linear in the size of the multiplicative specification.

For general specifications, compilation takes exponential time in the worst 
case, for two reasons. First, the normalization of a traversal specification may 
take exponential time and, second, the traversal specification may have an ex- 
ponential number of derivatives. 

From now on we can safely assume that the function $\partial_X(p)$ only deals with
specifications in 2NF. Once the elements of Der($p$) have been computed, the
compiler assigns numbers to them and reduces the computation of $\partial_X(p)$ to a
constant-time table lookup.

Starting from a specification $p$ and a designated source class $A$, a compiled
traversal specification is a quadruple $(A, P, p_0, \partial)$ where $p_0 \in P$ is the initial
traversal specification in 2NF, $P = \mathrm{Der}(p_0)$ is the set of normalized iterated
derivatives of $p_0$, and $\partial : P \times C \to P$ is the table of derivatives.

4 Adaption to a Class Structure 

Given a specific class graph $G_C$ and a compiled traversal
specification $(A, P, p_0, \partial)$, the next task is to produce a target program which
implements the traversal. The abstract model of the target program is a finite
automaton, the traversal automaton. We describe its construction and prove its
correctness with respect to the traversal specification. Then, we show that it is
uniformly minimal, i.e., it has the least number of states among those automata
that implement the semantics of $p_0$ and work for all possible class graphs.

4.1 Traversal Automaton 

The first step towards the target program is the construction of the traversal
automaton. The traversal automaton is a non-deterministic finite automaton
$\mathcal{A} = (Q, \Sigma, \delta, q_0, F)$ [S] where






Peter Thiemann 



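— $Q = C \times \{\mathrm{in}, \mathrm{out}\} \times P$ is the set of states;

— $\Sigma = C \cup \mathcal{N} \cup \{\diamond\}$ is the alphabet;

— $q_0 = (A, \mathrm{in}, p_0)$ is the initial state;

— $F = \{(A, \mathrm{out}, p) \mid A \in C,\ p \in P$ and there exist some $B$ and $p'$ such that
$(p = B + p' \lor p = B)$ and $A \in \mathrm{Subclasses}(B)\}$ is the set of final states;

— $\delta((A, \mathrm{in}, p), A) = \{(A, \mathrm{out}, p)\}$ and
$\delta((A, \mathrm{out}, p), l) = \{(B, \mathrm{in}, \partial_A(p)) \mid A \xrightarrow{l} B \in E_C\}$ defines the transition function.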

As usual, $L(\mathcal{A}, q)$ is the language recognized from initial state $q \in Q$.

Inheritance edges in the class graph are the only sources of non-determinism
in $\mathcal{A}$. All transitions on construction edges are deterministic because there is at
most one construction edge with a given source $u$ and label $l$.

The next lemma shows that the automaton $\mathcal{A}$ indeed implements the
semantics of the traversal specification $p_0$.

Lemma 7. The automaton $\mathcal{A}$ recognizes $\mathrm{RPathSet}(A, p_0)$.

We actually prove a more general claim: for all $A \in C$, for all $p \in P$,
$L(\mathcal{A}, (A, \mathrm{in}, p)) = \mathrm{RPathSet}(A, p)$. This can be shown using induction on the
length $n$ of a path $(A_0, l_1, A_1, l_2, \dots, l_n, A_n)$.



4.2 Minimal Traversal Automaton 

The step from the traversal automaton to generated code is simple. The states 
of the automaton correspond to a set of mutually recursive procedures/methods, 
each of which implements one step of the traversal and - possibly - an action. 
If there were equivalent states they would have to employ equivalent code to 
continue the traversal. In other words, the compiled code would suffer from 
code duplication. Hence, the traversal automaton should have as few states as 
possible. 

Standard results from automata theory [9, Sec. 3.4] show that a minimal
(deterministic) finite automaton always exists. This minimal automaton has the
property that for all its states $q$ and $q'$, if $q \neq q'$ then $L(\mathcal{A}, q) \neq L(\mathcal{A}, q')$, that is,
$q$ and $q'$ are distinguishable. Theorem 1 below demonstrates that the set Der($p$)
generates distinguishable states under certain assumptions on the class graph.

Since we want to compute the uniformly minimal automaton, i.e., an automaton
which works regardless of the actual class graph, we make the following two
assumptions.

Assumption 1 (REACHABILITY) Let $B$ be an arbitrary class. For all concrete
classes $A$ there is a label $l \neq \diamond$ such that $A \xrightarrow{l} B \in E_C$.



Assumption 2 (RICHNESS) Given a specification $p$, there is always a sufficient
number of classes not mentioned in $p$.

These assumptions are technical devices to enable a proof of the following
development. They are not the weakest possible assumptions, but rather they are 
chosen to make the proofs palatable. They are not meant to be imposed on class 
graphs submitted to the compiler. If a class graph violates the assumptions, then 
the generated automaton may not be minimal for this particular class graph. 

The intuition behind the assumptions is to guarantee the existence of a suit- 
ably connected set of classes which are not mentioned in the specification which 
is compiled. This is usually true because any given specification only mentions 
a very small subset of the classes in a system. 

From now on, REACHABILITY and RICHNESS are implicitly assumed for 
all statements. 

Theorem 1. Let $p_1, p_2 \in \mathrm{Der}(p)$, all in 2NF, such that $p_1 \neq p_2$.
For all $A \in C$, $\mathrm{RPathSet}(A, p_1) \neq \mathrm{RPathSet}(A, p_2)$.

We need a number of auxiliary lemmas to prove this theorem. First, we
establish that the inclusion of path sets for words implies that the words are
related by $\leq_0$.

Lemma 8. Let $v$ and $w$ be words in normal form.
Suppose $\forall A.\ \mathrm{RPathSet}(A, v) \subseteq \mathrm{RPathSet}(A, w)$. Then $w \leq_0 v$.

The proof is by induction on the length of $w$.

Since Lemma 2 provides the other implication, we have proved the following
theorem.

Theorem 2. Let $v$ and $w$ be words in normal form.
$\forall A.\ \mathrm{RPathSet}(A, v) \subseteq \mathrm{RPathSet}(A, w)$ if and only if $w \leq_0 v$.

Exploiting that $\leq_0$ is a partial order immediately yields the following.

Corollary 2. Let $v$ and $w$ be words in normal form. $\forall A.\ \mathrm{RPathSet}(A, v) =
\mathrm{RPathSet}(A, w)$ if and only if $w = v$.

This result for words extends to traversal specifications in strong 2NF. To
obtain the strong form of 2NF, we replace $\leq_0$ by $\leq$ in the definition of 2NF.

Theorem 3. Suppose $p$ and $p'$ are in strong 2NF. Then $\forall A.\ \mathrm{RPathSet}(A, p) =
\mathrm{RPathSet}(A, p')$ if and only if $p = p'$.

Theorem 1 follows immediately from Theorem 3. Theorem 3 characterizes all
specifiable traversals since it establishes a one-to-one correspondence between
traversal specifications in strong 2NF and specifiable path sets.

From another point of view, we have proved a normal form result for a
certain kind of regular expressions, namely traversal specifications. Theorem 3
completely characterizes them, by putting the languages in one-to-one
correspondence with expressions in normal form.

1. $(A, \mathrm{in}, p_0) \in Q$.

2. If $q = (A, \mathrm{in}, p) \in Q$ then let $q' = (A, \mathrm{out}, p)$ in

— $q' \in Q$; and

— if $q' \in F$ then active($q'$); and

— if active($q'$) then active($q$); and

— if active($q'$) then $(q, A, q') \in \delta$.

3. If $q = (A, \mathrm{out}, p) \in Q$ then
for each $A \xrightarrow{l} B \in E_C$ let $q' = (B, \mathrm{in}, \partial_A(p))$ in

— $q' \in Q$; and

— if active($q'$) then active($q$); and

— if active($q'$) then $(q, l, q') \in \delta$.

Fig. 6. Constraint system specifying those states of $\mathcal{A}$ which are reachable and
active



4.3 Generation of the Automaton 

The naive construction of the automaton A is too costly because not all states 
of A are accessible from the initial state. Fortunately, it is easy to restrict the 
construction of the state space of A so that only accessible states are constructed. 

Likewise, naive use of $\mathcal{A}$ to control a traversal leads to unnecessary visits
because some states of $\mathcal{A}$ are sink states. A state $q$ is a sink state if there is no
path from $q$ to a final state or, equivalently, $L(\mathcal{A}, q) = \emptyset$. The non-sink states
are easily identified by marking all those states that have a path to a final state 
(analogous to the construction of an automaton for INIT(L) from an automaton 
for L 0). The remaining unmarked states are sink states. 

Both properties can be computed in one traversal of the reachable part of the
automaton. This traversal takes $O(|E_C| \cdot |\mathrm{Der}(p_0)|)$ time. The constraint system
in Fig. 6 specifies the traversal. In the specification, the predicate active($q$) is
true iff $q$ is not a sink state. It can be implemented using standard techniques
in the complexity given above.
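One way to solve a constraint system of this shape is a forward search for the reachable states followed by a backward marking from the final states. The Python sketch below is generic and makes its own assumptions: the automaton is given by a hypothetical `successors` function and an `is_final` predicate rather than by a class graph and a derivative table, and it uses two passes instead of the single traversal mentioned in the text (the asymptotic cost is the same).

```python
from collections import defaultdict

def reachable_and_active(initial, successors, is_final):
    """successors(q) yields (label, q2) pairs.
    Returns (reachable states, active states, transitions into active states)."""
    reachable, todo = {initial}, [initial]
    preds = defaultdict(set)
    edges = []
    while todo:                              # forward pass: reachable states
        q = todo.pop()
        for label, q2 in successors(q):
            edges.append((q, label, q2))
            preds[q2].add(q)
            if q2 not in reachable:
                reachable.add(q2)
                todo.append(q2)
    active = {q for q in reachable if is_final(q)}
    todo = list(active)
    while todo:                              # backward pass: states with a path to F
        q = todo.pop()
        for p in preds[q]:
            if p not in active:
                active.add(p)
                todo.append(p)
    delta = [(q, l, q2) for (q, l, q2) in edges if q2 in active]
    return reachable, active, delta

# Toy automaton: state 3 is a sink, state 4 is unreachable from 0.
graph = {0: [("a", 1)], 1: [("b", 2), ("c", 3)], 2: [], 3: [], 4: [("d", 2)]}
reach, active, delta = reachable_and_active(0, lambda q: graph[q], lambda q: q == 2)
print(sorted(reach), sorted(active), sorted(delta))
```

Only transitions whose target is active are kept, mirroring the last constraint of items 2 and 3 in Fig. 6.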



5 Extensions and Further Work 

The algebraic approach using derivatives of traversal specifications is also
suitable for dynamic compilation [?]. In this case, the compilation time is constant
for each class (i.e., linear in the size of the class structure) and the amount of
work left to run time is comparable to that in Lieberherr and Patt-Shamir's
approach [?]. However, their approach employs a different notion of traversal
specification.

It is easy to generalize the framework to multiple source classes since traversal
specifications do not mention source classes to begin with. Also, the adaption to
multiple target classes requires no change in the method, due to our removal
of the notion of well-formedness. Well-formedness was only a real requirement
in the early work which relied on identifying the set of states of the traversal
automaton with the set of classes [15]. The later works have been able to dispense
with well-formedness, too [12].

Further operators like negation and intersection could be allowed for traversal
specifications. While the algebraic approach seems to work with these operators
in principle, its impact on normal forms and the remaining development of Sec. 4
has been left to further investigation. In the database community, more
expressive path expressions have been considered, including $l^{-1}$ (find an object
$o$ so that the current one is the value of instance variable $o.l$) and an operator
that finds the closest reachable object with instance variable $l$ [18]. These would
also be interesting to investigate.

6 Conclusion 

We have presented a new algebraic foundation for compiling adaptive programs. 
Our approach provides a simple and intuitive algorithm based on formal deriva- 
tives of traversal specifications, while maintaining and verifying the previously 
established complexity bounds. We have implemented the compilation of adap- 
tive programs using partial evaluation, thus substantiating earlier claims in this 
regard. We hope that this new perspective provides further insight into the
structure of adaptive programs.



Abstract. We investigate the question whether the well-known correspondence
between tree automata and the weak second-order logic of
two successor functions (WS2S) can be extended to tree automata with
tests. Our first insight is that there is no generalization of tree automata
with tests that has a decidable emptiness problem and that is equivalent
to the full class of formulae in some extension of WS2S, at least not
when we are asking for a conservative extension of the classical
correspondence between WS2S and tree automata to tree automata with
tests.

As a consequence we can extend the correspondence between tree automata
and WS2S to automata with tests only when we admit a restriction
of the class of formulae. We present a logic, called WS2Sy, and a
restriction of the class of formulae, called uniform, that is equivalent to
tree automata with tests.



1 Introduction 

The equivalence of tree automata and the weak second-order logic of two successor
functions, WS2S for short, has been known for more than 30 years [?]. During the current
decade, a lot of work has been done on classes of tree automata that are stronger
than classical tree automata (see [?] and [?] for a survey). The general frame to
obtain these stronger classes of tree automata is to augment the automaton
model with tests. In the case of tree automata with tests between brothers [1],
for instance, we can write a transition rule like $f(q_1, q_2) \xrightarrow{1=2} q$ that will accept
a tree $f(t_1, t_2)$ in state $q$ if $t_1$ is accepted in state $q_1$, $t_2$ is accepted in $q_2$, and if in
addition $t_1 = t_2$. These tests considerably increase the expressiveness of
tree automata. Tree automata with equality tests between brothers as sketched
above can, for instance, recognize the set of balanced trees (which is not possible
with classical tree automata).

Defining new classes of tree automata is of course only interesting when 
the usual “nice” properties of automata are preserved. The most important of 
these is the decidability of the emptiness of the language recognized by such an 
automaton. Furthermore we expect a good class of recognizable languages to 

* Partially supported by the Esprit Working Group 22457 - CCL II 

J. Tiuryn (Ed.): FOSSACS 2000, LNCS 1784, pp. 329-[?], 2000.

(c) Springer-Verlag Berlin Heidelberg 2000






Ralf Treinen 



be closed under operations such as union, intersection, complement and (maybe 
restricted classes of) homomorphisms and inverse homomorphisms. 

Several classes of tree automata with “nice” properties have been identified
(see [2] for an excellent overview of tree automata and tree automata with tests).
In light of the above mentioned equivalence between tree automata and the logic 
WS2S it is a natural question whether there are similar logical characterizations 
of tree automata with tests. Before attacking this question we will try to ex- 
plain in an informal manner the logic WS2S and its correspondence with tree 
automata. 

The second-order logic WS2S comes with a standard interpretation. There
are two sorts: the sort of words over the alphabet {1,2} and the sort of finite sets
of words. There is a constant for the empty word, unary functions to append
a symbol 1, resp. 2, at the end of a word, and the membership relation between
words and sets. An example of a formula in this logic, expressing that the word $x$
is a prefix of the word $y$, is (the convention is that word variables are written in
lower case and set variables in upper case):

$\forall Y\, (y \in Y \land \forall z\, ([z1 \in Y \lor z2 \in Y] \to z \in Y) \to x \in Y)$
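The prefix formula quantifies over all finite sets $Y$ that contain $y$ and are closed under removing a final symbol; $x$ belongs to all of them exactly when it belongs to the smallest one. The Python sketch below computes that smallest set directly. It assumes words are encoded as strings over the characters "1" and "2", and the name `is_prefix` is hypothetical.

```python
def is_prefix(x, y):
    """x is a prefix of y iff x lies in the smallest set that contains y
    and is closed under dropping the last symbol of a word."""
    closure = {y}
    while y:
        y = y[:-1]          # from z1 or z2 conclude z
        closure.add(y)
    return x in closure

print(is_prefix("12", "1221"))
print(is_prefix("2", "1221"))
```

The closure has at most $|y| + 1$ elements, so the universal set quantifier never has to be enumerated.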

A tree automaton with k states can be translated into a formula of WS2S in the 
following sense: We represent a tree over an alphabet of n symbols by n sets, each 
of them representing the occurrences (addresses of nodes) of the tree marked with
one of the symbols. The formula has to state that these sets are pairwise disjoint
(since an occurrence can carry only one symbol), that their union is closed under 
prefix (using the prefix predicate defined above) and that the arity of the function 
symbols is respected, that there is an assignment of states to the occurrences 
(this is expressed by an existential quantification over k sets similar to what we 
did to represent the term itself) respecting the transition rules of the automaton 
such that the root is marked with an accepting state. 

Given a formula $\phi$ with $n$ free variables (by a slight variation of the logic
we can assume that only set variables are involved) we first have to say how to
represent an $n$-tuple of sets as a tree: We take as tree signature the $n$-tuples over
{0, 1}; a tree representing an $n$-tuple of sets has at occurrence $\pi$ the symbol 1
in the $i$-th component if and only if $\pi$ belongs to the $i$-th set. Since all sets
in WS2S are finite, a finite tree is sufficient (there is also a technically much
more involved variant for infinite sets and infinite trees due to Rabin [?]). Now,
the idea to get an automaton that corresponds to a given formula is to first
construct automata for the atomic formulae of the logic, and then to apply the
closure of tree automata under union, complement and projection (corresponding
to disjunction, negation and existential quantification).

In fact there is a problem with the translation of formulae into automata that
is easily overlooked when sketched as we did above: For this translation to work we
need an additional closure property of tree automata (which, luckily, does hold
for classical tree automata): closure under cylindrification. The problem is apparent
when we have constructed a tree automaton for a formula $\phi(X_1, X_2)$ and another
one for a formula $\psi(X_2, X_3)$ and when we now want to construct from these the
automaton for $\phi(X_1, X_2) \land \psi(X_2, X_3)$. We cannot just take the intersection of
these two automata since the two “components” of the trees recognized by the
respective automata do not correspond to the same variables. In fact we have to
cylindrify both automata, that is, add to the first automaton a third component
and to the second automaton a new first component that are in fact completely
ignored by the rules of the automata (the origin of the name “cylindrification”
is the intuition of a two-dimensional geometrical figure that is slid along the
third dimension through space). Finally we can take the intersection of the two
cylindrified automata.

Here lies the problem when we ask for a logical characterization of tree au- 
tomata with tests: The closure properties of the logic, including the “implicit” 
closure by cylindrification that simply stems from the presence of variables in 
the logic, require any class of automata that corresponds to the full class of 
formulae of some logic to be closed under cylindrification. 

If we need cylindrification for our class of automata to correspond to some
logic then we could simply extend our automaton model accordingly and try
to show that the “nice” properties of automata still hold. The automaton model
that we get from Tree Automata with Tests (TAT) when we close under
cylindrification is that of Tree Automata with Component-wise Tests (TACT), which will be
formally defined in Section 3. Unfortunately, emptiness of TACT automata is
undecidable, as we will show in Section 4; that is, our original program simply
has to fail.

Hence we propose to ask for a subset of the set of formulae of some suitable
logic that is equivalent to our class of automata. We define in Section 5 such a
logic WS2Sy and its two subclasses of so-called restricted and uniform formulae.
We state the exact correspondence between TACT and TAT automata and the
subclasses of restricted, resp. uniform, formulae of WS2Sy in Section 6 and prove
this correspondence in Sections 7 and [?]. The technical part of the paper only
considers automata with tests between brothers; we briefly discuss extension to
other classes of automata with tests in Section [?].

2 Preliminaries 

A signature $\Sigma$ consists of a set of symbols (also denoted $\Sigma$) and an arity function
$\alpha: \Sigma \to \mathbb{N}_0$. Often one writes $\Sigma_n$ for $\{f \in \Sigma \mid \alpha(f) = n\}$. For a signature $\Sigma$
the set $T(\Sigma)$ of $\Sigma$ ground terms is the smallest set such that if $f \in \Sigma_n$ and
$t_1, \dots, t_n \in T(\Sigma)$ then $f(t_1, \dots, t_n) \in T(\Sigma)$. The set of occurrences $O(t)$ of a
tree $t = f(t_1, \dots, t_n)$ is $\{\varepsilon\} \cup \bigcup_{i=1}^{n} \{iw \mid w \in O(t_i)\}$. If $\pi \in O(t)$ then the subtree
$t|_\pi$ of $t$ at $\pi$ is defined by $t|_\varepsilon = t$ and $f(t_1, \dots, t_n)|_{i\pi} = t_i|_\pi$.

Given signatures $\Sigma$, $\Gamma$, a renaming $\rho$ is a function $\rho: \Sigma \to \Gamma$ such that
$\alpha(\rho(f)) \leq \alpha(f)$ for all $f \in \Sigma$. A renaming $\rho: \Sigma \to \Gamma$ extends to a tree
homomorphism $\rho: T(\Sigma) \to T(\Gamma)$ by $\rho(f(t_1, \dots, t_n)) = (\rho(f))(\rho(t_1), \dots, \rho(t_{\alpha(\rho(f))}))$
(hence, arguments not needed in $\Gamma$ are simply dropped).

An instance of the Post Correspondence Problem (short: PCP) is a finite
sequence $P = ((p_i, q_i))_{i=1,\dots,m}$ of pairs of words from $\{a, b\}^*$. A solution of such
an instance of PCP is a nonempty sequence $i_1, \dots, i_n$ with $1 \leq i_j \leq m$ such
that $p_{i_1} \cdots p_{i_n} = q_{i_1} \cdots q_{i_n}$. According to a classical result due to Post [?], it is
undecidable whether an instance of the PCP has a solution or not.
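Checking a proposed solution is easy even though finding one is undecidable in general. The Python sketch below verifies a candidate index sequence; the encoding (a list of string pairs, 1-based indices) and the name `is_solution` are assumptions made for illustration. The instance is the one used in Fig. 2 of the text.

```python
def is_solution(instance, indices):
    """Check whether the 1-based index sequence solves the PCP instance,
    given as a list of (p_i, q_i) string pairs."""
    if not indices:
        return False                                  # solutions are nonempty
    if any(i < 1 or i > len(instance) for i in indices):
        return False
    top = "".join(instance[i - 1][0] for i in indices)
    bottom = "".join(instance[i - 1][1] for i in indices)
    return top == bottom

instance = [("a", "aab"), ("abb", "b")]
print(is_solution(instance, [1, 2]))   # compares a.abb with aab.b
print(is_solution(instance, [2, 1]))
```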

3 Tree Automata with Component-wise Tests 

Definition 1 (Tuple Signature). For any finite alphabet $\Gamma$ and $n > 0$, the
signature $\Gamma^{(n)}$ consists of the set of function symbols $(\Gamma \cup \{\bot\})^n$, where $\bot$ is a
new symbol. The symbol $(\bot \cdots \bot)$ is a constant, all other symbols are binary.



Example 1. Let $\Gamma = \{0, 1\}$. The signature $\Gamma^{(2)}$ contains the binary symbols
$00$, $01$, $0\bot$, $10$, $11$, $1\bot$, $\bot 0$, $\bot 1$ and the constant $\bot\bot$.

Definition 2 (Projection and Component of a Tree). Let $t$ be a $\Gamma^{(n)}$-tree
and $1 \leq i \leq n$.

1. The $i$-th component of $t$, denoted $t^i$, is the $\Gamma^{(1)}$-tree given by the renaming
$(f_1 \cdots f_i \cdots f_n) \mapsto f_i$ for all $f_j \in \Gamma \cup \{\bot\}$.

2. The $i$-th projection of $t$ is the $\Gamma^{(n-1)}$-tree given by the renaming
$(f_1 \cdots f_{i-1} f_i f_{i+1} \cdots f_n) \mapsto (f_1 \cdots f_{i-1} f_{i+1} \cdots f_n)$ for all $f_j \in \Gamma \cup \{\bot\}$.

Note that both projection and selection of a component can change the set
of occurrences of a tree since a binary symbol may become a constant, as is
illustrated in the example given in Figure 1.
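Component and projection are both renamings of symbols, with the twist that a symbol renamed to all-bottom becomes a constant and loses its subtrees. The Python sketch below illustrates this. The encoding is an assumption: a symbol is a string with one character per component, `"_"` is an ASCII stand-in for $\bot$, a binary node is a triple `(symbol, left, right)`, and a constant is a 1-tuple. The example tree encodes the triple of sets from Example 2 of the text.

```python
BOT = "_"   # ASCII stand-in for the bottom symbol

def rename(tree, f):
    """Apply the symbol renaming f; an all-bottom result is a constant,
    so the subtrees below it are dropped."""
    sym = f(tree[0])
    if len(tree) == 1 or all(c == BOT for c in sym):
        return (sym,)
    return (sym, rename(tree[1], f), rename(tree[2], f))

def component(tree, i):
    """i-th component (1-based): keep only the i-th letter of each symbol."""
    return rename(tree, lambda s: s[i - 1])

def projection(tree, i):
    """i-th projection (1-based): delete the i-th letter of each symbol."""
    return rename(tree, lambda s: s[:i - 1] + s[i:])

leaf = ("___",)
t = ("101", ("11_", leaf, leaf), ("__1", leaf, leaf))

print(component(t, 3))    # the right subtree survives, the left one is cut
print(projection(t, 2))
```

In `component(t, 3)` the left son carries $\bot$ in the third position, so it collapses to a constant and the set of occurrences shrinks.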




[Figure: three trees, panels (a) Tree t, (b), (c)]

Fig. 1. Example of projection and component.



Definition 3 (Tree Automata). A tree automaton with component-wise tests
between brothers (short: TACT) is a quintuplet $(\Gamma, n, Q, Q_f, \Delta)$ where

— $\Gamma$ is a finite alphabet and $n \geq 1$,

— $Q$ is a finite set of states,

— $Q_f \subseteq Q$ is the set of accepting states,

— $\Delta$ is a set of transition rules of one of the following forms:

• $(\bot \cdots \bot) \to q$, $q \in Q$.

• $f(q_1, q_2) \xrightarrow{c} q$ where $c$ is a boolean combination of tests $1.i = 2.i$ with
$1 \leq i \leq n$, $f \in \Gamma^{(n)} \setminus \{(\bot \cdots \bot)\}$, $q, q_1, q_2 \in Q$.

A tree automaton with tests between brothers (short: TAT) is a TACT
automaton $(\Gamma, n, Q, Q_f, \Delta)$ where all tests of rules in $\Delta$ are either True, or
$1.1 = 2.1 \land \dots \land 1.n = 2.n$, or $1.1 \neq 2.1 \lor \dots \lor 1.n \neq 2.n$.

A tree automaton (short: TA) is a TACT $(\Gamma, n, Q, Q_f, \Delta)$ where all tests of
rules in $\Delta$ are True.

Note that in a TAT all tests operate uniformly on all components. Hence
TATs are exactly the tree automata with tests between brothers introduced
in [1], and TA are classical tree automata. In the examples we will simply drop a
test True on a transition rule and write it as the rule of a classical tree automaton.

Definition 4. The TACT $\mathcal{A}$ tests only on the set of components $I$ if whenever
$\mathcal{A}$ contains a transition rule with a test $c$ and $1.i = 2.i$, resp. $1.i \neq 2.i$, is an
atomic test in $c$, then $i \in I$.


Definition 5 (Acceptance by a TACT). A tree $t$ satisfies the test $1.i = 2.i$
if $(t|_1)^i = (t|_2)^i$; this extends in the canonical way to boolean combinations of
tests.

The rewriting relation $\to_{\mathcal{A}}$ of a TACT $\mathcal{A} = (\Gamma, n, Q, Q_f, \Delta)$ is the smallest binary
relation $\to_{\mathcal{A}} \subseteq T(\Gamma^{(n)}) \times Q$ with

— $a \to_{\mathcal{A}} q$ if $a \to q \in \Delta$,

— $f(t_1, t_2) \to_{\mathcal{A}} q$ if $f(q_1, q_2) \xrightarrow{c} q \in \Delta$, $f(t_1, t_2)$ satisfies $c$, $t_1 \to_{\mathcal{A}} q_1$ and
$t_2 \to_{\mathcal{A}} q_2$.

The automaton $\mathcal{A}$ accepts a tree $t$ if $t \to_{\mathcal{A}} q$ for some accepting state $q \in Q_f$.
The set $L_{\mathcal{A}}$ recognized by $\mathcal{A}$ is the set of all trees accepted by $\mathcal{A}$.



Definition 6 (Deterministic and Complete TACT). A TACT $\mathcal{A}$ is deterministic
if for every tree $t$ there is at most one state $q$ such that $t \to_{\mathcal{A}} q$, and
complete if for every tree $t$ there is at least one state $q$ with $t \to_{\mathcal{A}} q$.
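The rewriting relation of Definition 5 can be evaluated bottom-up. The Python sketch below computes, for a tree, the set of all states it rewrites to. The encoding is an assumption made for illustration: symbols are strings with one character per component, `"_"` stands for $\bot$, a binary node is `(symbol, left, right)`, a rule is a tuple `(symbol, q1, q2, test, q)`, and a test is a Python callable that receives a function `eq(i)` telling whether $1.i = 2.i$ holds. All names are hypothetical.

```python
BOT = "_"   # ASCII stand-in for the bottom symbol

def component(tree, i):
    """i-th component (1-based); a bottom letter turns the node into a constant."""
    letter = tree[0][i - 1]
    if letter == BOT or len(tree) == 1:
        return (letter,)
    return (letter, component(tree[1], i), component(tree[2], i))

def run(tree, leaf_states, rules):
    """Set of all states q such that the tree rewrites to q."""
    if len(tree) == 1:
        return set(leaf_states)            # rules (bottom ... bottom) -> q
    sym, left, right = tree
    ql = run(left, leaf_states, rules)
    qr = run(right, leaf_states, rules)

    def eq(i):                             # the atomic test 1.i = 2.i
        return component(left, i) == component(right, i)

    return {q for (f, q1, q2, test, q) in rules
            if f == sym and q1 in ql and q2 in qr and test(eq)}

def accepts(tree, leaf_states, rules, final):
    return bool(run(tree, leaf_states, rules) & set(final))

leaf = ("__",)
u = ("aa", leaf, leaf)
v = ("ab", leaf, leaf)
t = ("aa", u, v)            # the sons agree on component 1 only

tact = [(s, "q", "q", lambda eq: eq(1), "q") for s in ("aa", "ab")]
tat = [(s, "q", "q", lambda eq: eq(1) and eq(2), "q") for s in ("aa", "ab")]

print(accepts(t, ["q"], tact, {"q"}))   # test on component 1 alone
print(accepts(t, ["q"], tat, {"q"}))    # uniform test on both components
```

The two automata differ only in their tests: the first tests only on the component set $\{1\}$ in the sense of Definition 4, the second tests uniformly as a TAT does.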

The following proposition is immediate from our definitions:

Proposition 1. Every TA-recognizable set is TAT-recognizable, and every
TAT-recognizable set is TACT-recognizable.

4 The Emptiness Problem for TACT



Theorem 1. The emptiness-problem for TACT automata (even when restricted 
to the class of TACT without negative constraints) is undecidable. 

Proof: We reduce the Post Correspondence Problem to the emptiness problem
of TACT automata. Let $P = ((p_i, q_i))_{i=1,\dots,m}$ be an instance of PCP.

First some terminology: a tree $t \in T(\Gamma^{(n)})$ is a comb if every binary node
of $t$ has the constant $(\bot \cdots \bot)$ as its left son. For example, the tree of Figure 2
is not a comb but its right subtree is. For a comb $t \in T(\Gamma^{(n)})$ and $1 \leq i \leq n$,
the $i$-th word coded by $t$ is the word $t{\downarrow}i \in \Gamma^*$ defined by

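$t{\downarrow}i = \begin{cases} \varepsilon & \text{if } t^i = \bot \\ c \cdot (t|_2){\downarrow}i & \text{if } t^i = c(\bot, t^i|_2) \end{cases}$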

For instance, if $t$ is the right subtree of the tree depicted in Figure 2 then $t{\downarrow}1 =
aabb$ and $t{\downarrow}3 = fa$.

We define the signature of our automaton as $\{f, a, b\}^{(4)}$. The automaton for $P$
is constructed in three steps:

The first step is to construct an automaton that accepts a comb if it encodes 
two pairs of words such that the second pair is obtained from the first pair by 
application of one step of the PCP instance P. For reasons that will be apparent 
in the last step this automaton comes in a positive and in a negative version that 
differ in the choice of the components that code the two pairs. More exactly: 

For any word $v \in \Gamma^*$, let $\mathcal{A}^v_{i,j}$ for $i \neq j$ be the automaton that accepts $t$ iff
$t$ is a comb, $t{\downarrow}i = fw$ and $t{\downarrow}j = wv$ for some $w \in \Gamma^*$. If $v = v_1 \cdots v_n$ then the
rules of the automaton are (here for $i = 1$ and $j = 2$), where _ stands for any
symbol from $\{a, b, f, \bot\}$ and $\alpha, \beta$ for any symbol from $\{a, b\}$:



(FAFF) ^ qi 
(FFFF) ^ 
(FF__)(®,g„) ^ Qn 
{±v,-){qi,q^) qi-i 
(/ui~)(®,gi) ^ go 
(aui__)(®,gi) ^ qa 
(a/3__)(®,g^) ^ qa 
(//3-)(®,'7/3) ^ qo 



z = 2, . . . , n 



The only accepting state is $q_0$.

From the boolean closure of (classical) tree automata we get an automaton
$\mathcal{A}^+_P$ that accepts $t$ iff $t$ is a comb and for some $i$ we have that $t{\downarrow}1 = fw_1$,
$t{\downarrow}2 = fw_2$, $t{\downarrow}3 = w_1 p_i$ and $t{\downarrow}4 = w_2 q_i$. Analogously, the automaton $\mathcal{A}^-_P$ accepts
$t$ iff $t$ is a comb and for some $i$ we have that $t{\downarrow}1 = w_1 p_i$, $t{\downarrow}2 = w_2 q_i$, $t{\downarrow}3 = fw_1$
and $t{\downarrow}4 = fw_2$. Let $Q^+$ resp. $Q^-$ be the accepting states of these two automata.
In case of the tree $t$ of Figure 2, for example, $\mathcal{A}^+_P$ accepts $t|_{12}$ and $\mathcal{A}^-_P$ accepts
$t|_2$.

The second step is to construct automata that in addition test whether the
newly constructed pair is of the form $(w, w)$ for some non-empty word $w$. We can
easily derive from $\mathcal{A}^+_P$ the automaton $\mathcal{AE}^+_P$ (with set of accepting states $Q^+_=$)
that emulates $\mathcal{A}^+_P$ and verifies in addition that $t{\downarrow}3 = t{\downarrow}4 \neq \varepsilon$, and analogously
$\mathcal{AE}^-_P$ (with set of accepting states $Q^-_=$) that emulates $\mathcal{A}^-_P$ and verifies in addition
that $t{\downarrow}1 = t{\downarrow}2 \neq \varepsilon$. In case of the tree $t$ of Figure 2, $\mathcal{AE}^-_P$ accepts $t|_2$.

Note that the first two steps used only classical tree automata. The third
step is now to build the final automaton which uses these four automata as
sub-automata (we hence assume their state spaces to be disjoint) to accept a
solution sequence of $P$. This is where equality tests come in.



(TTTT) 


p± 






p+ 




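$(ffff)(p^+, q^+) \xrightarrow{1.1=2.1 \land 1.2=2.2} p^-$,  $q^+ \in Q^+$

$(ffff)(p^-, q^-) \xrightarrow{1.3=2.3 \land 1.4=2.4} p^+$,  $q^- \in Q^-$

$(ffff)(p^+, q^+) \xrightarrow{1.1=2.1 \land 1.2=2.2} p_f$,  $q^+ \in Q^+_=$

$(ffff)(p^-, q^-) \xrightarrow{1.3=2.3 \land 1.4=2.4} p_f$,  $q^- \in Q^-_=$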



where $p_f$ is the only accepting state. An example of the representation of the
solution of an instance of PCP as accepted by this automaton is given in Figure 2.
In this example, the accepting run assigns the occurrence 11 the state $p^+$, the
occurrence 1 the state $p^-$ and the occurrence $\varepsilon$ the accepting state $p_f$. □

The undecidability result holds even in case of signatures with only two
components. For a proof, it suffices to encode in the above proof the first and
second, respectively third and fourth, component of the signature into one symbol.



5 The Logic WS2Sy 

The logic WS2Sy is a two-sorted logic with the sorts w (for word) and s (for
set). We will write variables of sort w in lower case and variables of sort s in
upper case. Function symbols are a constant $\varepsilon$ of sort w and two monadic function
symbols 1 and 2 of profile w → w. We write applications of these function symbols
in postfix notation, that is, $x1121$ for $1(2(1(1(x))))$. Relation symbols are $\in$ of
profile w × s, and for every $n \geq 1$ an $(n+1)$-ary predicate Sy of profile w × s × … × s.

The standard interpretation of WS2Sy assigns the sort w as universe the set
$\{1,2\}^*$ and the sort s as universe the set of all finite subsets of $\{1,2\}^*$. The
constant $\varepsilon$ is interpreted as the empty word, and the function 1 (resp. 2) as the
function that appends 1 (resp. 2) to a word. The predicate symbol $\in$ denotes
set membership. The synchronisation predicate $Sy(w, M_1, \dots, M_n)$ holds if for
all $1 \leq i \leq n$ and all $x \in \{1,2\}^*$ we have that $w1x \in M_i$ iff $w2x \in M_i$.
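Because the sets are finite, the synchronisation predicate can be decided by comparing what each set contains below $w1$ with what it contains below $w2$. The Python sketch below does this; it assumes words are strings over the characters "1" and "2" (the empty string for $\varepsilon$), and the name `sy` is hypothetical.

```python
def sy(w, *sets):
    """Sy(w, M1, ..., Mn): for every x, w1x is in Mi iff w2x is in Mi."""
    for m in sets:
        below_1 = {u[len(w) + 1:] for u in m if u.startswith(w + "1")}
        below_2 = {u[len(w) + 1:] for u in m if u.startswith(w + "2")}
        if below_1 != below_2:
            return False
    return True

print(sy("", {"1", "2", "11", "21"}))
print(sy("", {"1"}))
print(sy("1", {"", "11", "12"}))
```

In tree terms, the predicate states that the two subtrees below occurrence $w$ carry the same marking in every listed component, which is the logical counterpart of an equality test between brothers.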

The notion of a free occurrence of a variable in a formula is defined as usual.
A WS2Sy-formula $\phi$ is called restricted if whenever $\phi$ contains a subformula of
the form $Sy(x, X_1, \dots, X_n)$ then $X_1, \dots, X_n$ are free variable occurrences (here,
there is no restriction on $x$ being free or bound). A WS2Sy-formula $\phi$ is called
uniform if whenever $\phi$ contains a subformula of the form $Sy(x, X_1, \dots, X_n)$ then
$X_1, \dots, X_n$ are exactly the free variable occurrences of $\phi$ (hence, $x$ is bound). As
a consequence, every uniform formula of WS2Sy is restricted, and the classical
logic WS2S is obtained from WS2Sy by omitting the synchronisation predicates.

[Figure: tree drawing]

Fig. 2. Example of the representation of the solution $((\varepsilon, \varepsilon), (a, aab), (aabb, aabb))$
of the PCP instance $((a, aab), (abb, b))$.

The logic WS2Sy allows us to define the usual set-theoretic notions such as union,
intersection, singleton sets, etc., which we will use freely in the sequel. Furthermore,
the prefix relation on words (denoted $x \leq y$) is expressible in WS2S, as has been
demonstrated in Section 1, hence in WS2Sy.

Following the classical translation of the logic WS2S into tree automata,
we define the one-sorted version WS2Sy$_0$ of WS2Sy where only set variables and
predicates on sets are used. The predicates of WS2Sy$_0$ are $X_1 = X_2 1$, $X_1 = X_2 2$,
$X_1 = \varepsilon$, $X_1 \subseteq X_2$, and $Sy(X_0, X_1, \dots, X_n)$. In the standard interpretation of
WS2Sy$_0$ we have that $M_1 = M_2 1$ is true iff $M_2$ is some singleton set $\{m_2\}$ and
$M_1 = \{m_2 1\}$, and analogously for $M_1 = M_2 2$; $M_1 = \varepsilon$ is true iff $M_1 = \{\varepsilon\}$; $\subseteq$ is set
inclusion; and $Sy(M_0, M_1, \dots, M_n)$ is true in WS2Sy$_0$ iff $M_0$ is some singleton
set $\{m_0\}$ and $Sy(m_0, M_1, \dots, M_n)$ is true in WS2Sy. We can now translate in
a straightforward way any formula $\phi(x_1, \dots, x_n, Y_1, \dots, Y_m)$ of WS2Sy into a
formula $\phi_0(X_1, \dots, X_n, Y_1, \dots, Y_m)$ of WS2Sy$_0$ such that



Predicate Logic and Tree Automata with Tests

$w_1, \dots, w_n, M_1, \dots, M_m \models \phi$ iff $\{w_1\}, \dots, \{w_n\}, M_1, \dots, M_m \models \phi_0$

and any formula $\psi(X_1, \dots, X_n, Y_1, \dots, Y_m)$ of WS2Sy$_0$ (for an appropriate
enumeration of the variables) into a formula $\psi'(x_1, \dots, x_n, Y_1, \dots, Y_m)$ of WS2Sy
such that

$M_1, \dots, M_n, N_1, \dots, N_m \models \psi$ iff there exist $m_i$ such that $M_i = \{m_i\}$ for all $i$ and
$m_1, \dots, m_n, N_1, \dots, N_m \models \psi'$.

We say that a set $S$ of $n$-tuples of sets of words over $\{1,2\}$ is definable in
WS2Sy (resp. restricted WS2Sy or uniform WS2Sy) if there is a WS2Sy-formula
(resp. restricted WS2Sy-formula or uniform WS2Sy-formula) $\phi(X_1, \ldots, X_n)$ such
that

$$(M_1, \ldots, M_n) \in S \quad\text{iff}\quad M_1, \ldots, M_n \models \phi.$$

According to the translation between WS2Sy and WS2SyQ that we have men- 
tioned above it does not matter whether we take WS2Sy or WS2SyQ to define 
the notion of definability. 

6 Recognizability and Definability 

Before we can formulate the correspondence between definability in (subclasses 
of) WS2Sy and recognizability by (subclasses of) TACT we need translations of 
tuples of finite sets of words into trees, and vice versa. 

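Definition 7. The tree representation of a finite set $W \subseteq \{1,2\}^*$ is the tree
$\overline{W} \in T(\{0,1\}^{1})$ defined by

$$\overline{W} = \begin{cases} \bot & \text{if } W = \emptyset \\ 0(\overline{1^{-1}W}, \overline{2^{-1}W}) & \text{if } W \neq \emptyset \text{ and } \epsilon \notin W \\ 1(\overline{1^{-1}W}, \overline{2^{-1}W}) & \text{if } W \neq \emptyset \text{ and } \epsilon \in W \end{cases}$$

where $1^{-1}W = \{w \mid 1w \in W\}$ and $2^{-1}W = \{w \mid 2w \in W\}$.

The tree representation $\overline{(W_1, \ldots, W_n)}$ of an $n$-tuple $(W_1, \ldots, W_n)$ of finite
subsets of $\{1,2\}^*$ is the tree $t \in T(\{0,1\}^{n})$ whose $i$-th component is $\overline{W_i}$ for all $i$.

The tree representation $\overline{S}$ of a set $S$ of tuples of finite subsets of $\{1,2\}^*$ is
$\{\overline{(W_1, \ldots, W_n)} \mid (W_1, \ldots, W_n) \in S\}$.

Note that the tree representation of a tuple of sets is uniquely defined since
$(\bot \ldots \bot)$ is a constant.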
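
As a concrete illustration of Definition 7, the following Python sketch computes the tree representation of a tuple of finite sets of words. The encoding of trees as nested tuples `(label, left, right)`, the string `"bot"` standing for the constant written ⊥, and all function names are assumptions of this sketch and not notation from the paper.

```python
BOT = "bot"  # stands for the bottom symbol of the text (an encoding choice of this sketch)


def quotient(words, d):
    """Return d^{-1}W = {w | d w in W} for a digit d in "12"."""
    return frozenset(w[1:] for w in words if w.startswith(d))


def tree_of_tuple(sets):
    """Tree representation of a tuple of finite sets of words over {1, 2}.

    Each node is (label, left, right); the label has one entry per component:
    BOT if that component is empty below this position, 1 if the empty word
    belongs to it, and 0 otherwise. Leaves carry the all-BOT label.
    """
    sets = [frozenset(s) for s in sets]
    label = tuple(BOT if not s else (1 if "" in s else 0) for s in sets)
    if all(x == BOT for x in label):
        return (label, None, None)
    left = tree_of_tuple([quotient(s, "1") for s in sets])
    right = tree_of_tuple([quotient(s, "2") for s in sets])
    return (label, left, right)


def tree_of_set(words):
    """Tree representation of a single set, viewed as a 1-tuple."""
    return tree_of_tuple([words])
```

For the set {ε, 2} of Example 2 (with ε written as the empty string), `tree_of_set({"", "2"})` yields a root labelled 1 whose left child is the constant and whose right child is again labelled 1 with two constant children.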

Example 2. The tree representation of the set {e, 2} is the right tree of Figured] 
The tree representation of the triple of sets ({e, 1}, {1}, {e, 2}) is the left tree 
of Figure |T] 
Theorem 2. Let S be a set of n-tuples of words. 

1. If S is definable in restricted WS2Sy then S is TACT-recognizable. 

2. If S is definable in uniform WS2Sy then S is TAT-recognizable. 

3. If S is definable in WS2S then S is TA-recognizable (fU^). 

The proof of the first two items of this theorem is the subject of Section 7; the
last item is due to [10].

Definition 8. The tuple representation $((t))$ of a tree $t \in T(\Gamma^{n})$ for finite $\Gamma$,
say $\Gamma = \{c_1, \ldots, c_m\}$, is the $n(m + 1)$-tuple of finite subsets of $\{1, 2\}^*$

$$(M_0^1, M_1^1, \ldots, M_m^1, \ldots, M_0^n, M_1^n, \ldots, M_m^n)$$

defined by

$$M_0^i = \{w \mid \text{the } i\text{-th component of the root of } t|_w \text{ is } \bot\}$$

$$M_j^i = \{w \mid \text{the } i\text{-th component of the root of } t|_w \text{ is } c_j\}$$

The tuple representation $((S))$ of a set $S$ of trees in $T(\Gamma^{n})$ is $\{((t)) \mid t \in S\}$.



Example 3. The tuple representation of the middle tree of Figure [His 
({ll,12,2},{},{e,l}, {ll,12,2},{e},{l}) 
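
The following Python sketch illustrates Definition 8 by collecting, for each component and each symbol, the positions carrying that symbol. The nested-tuple tree encoding, the string `"bot"` for ⊥, and the function name are assumptions of this sketch. The figure is not reproduced here, so the sample tree is simply the tree that the tuple of Example 3 encodes when Γ = {0, 1}.

```python
BOT = "bot"  # stands for the bottom symbol c_0 (an encoding choice of this sketch)


def tuple_representation(tree, alphabet, n):
    """Return (M_0^1, ..., M_m^1, ..., M_0^n, ..., M_m^n) for a tree of n-tuples.

    A node is (label, left, right) with left = right = None for leaves;
    positions are strings over "12", the root being the empty string.
    """
    symbols = [BOT] + list(alphabet)  # c_0 = bot, then c_1, ..., c_m
    sets = {(i, c): set() for i in range(n) for c in symbols}

    def walk(node, pos):
        label, left, right = node
        for i, c in enumerate(label):
            sets[(i, c)].add(pos)
        if left is not None:
            walk(left, pos + "1")
            walk(right, pos + "2")

    walk(tree, "")
    return tuple(frozenset(sets[(i, c)]) for i in range(n) for c in symbols)


# Sample tree with two components: root (1, 0), left child (1, 1), leaves (bot, bot).
LEAF = ((BOT, BOT), None, None)
SAMPLE = ((1, 0), ((1, 1), LEAF, LEAF), LEAF)
```

Calling `tuple_representation(SAMPLE, [0, 1], 2)` returns six sets that coincide with the tuple listed in Example 3, reading ε as the empty string.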

Theorem 3. Let $S \subseteq T(\Gamma^{n})$.

1. If S is TACT-recognizable then {{S)) is definable in restricted WS2Sy. 

2. If S is TAT-recognizable then {{S)) is definable in uniform WS2Sy. 

3. If S is TA-recognizable then {{S)) is definable in WS2S 

The proof of the first two items of this theorem is the subject of Section 8; the
last item is due to m- 

7 Translating Formulae to Automata 

In this section we prove Theorem 2 by translating a restricted WS2SyQ-formula
$\phi(X_1, \ldots, X_n)$ into a TACT-automaton $A_\phi$ over $\{0, 1\}^{n}$ such that $A_\phi$ is a TAT
when $\phi$ is uniform, and $A_\phi$ is a classical tree automaton when $\phi$ is a WS2S
formula.

The first step is to construct automata for the atomic formulae of WS2SyQ.
Automata for the atomic formulae $X_1 = X_2 1$, $X_1 = X_2 2$, $X_1 = \epsilon$ and $X_1 \subseteq X_2$
can be found in the literature (for instance [1]). It remains to give an automaton
for the synchronization predicates:






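The automaton $A_n$ is defined as $(\{0, 1\}, n, \{q_0, q_1\}, \{q_1\}, \Delta)$ where $\Delta$ contains
the rules

$(\bot \ldots \bot) \to q_0$

$(\bot\,\_ \ldots \_)(q_0, q_0) \to q_0$

$(0\,\_ \ldots \_)(q_0, q_0) \to q_0$

$(1\,\_ \ldots \_)(q_0, q_0) \xrightarrow{1.2 = 2.2 \,\wedge\, \ldots \,\wedge\, 1.n = 2.n} q_1$

$(0\,\_ \ldots \_)(q_0, q_1) \to q_1$

$(0\,\_ \ldots \_)(q_1, q_0) \to q_1$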



Proposition 2. Let $S_n$ be the set of all $n$-tuples $S$ of sets of words such that
$S \models Sy(X_1, X_2, \ldots, X_n)$. The language recognized by $A_n$ is $S_n$.

In the case of the equivalence between WS2S and classical tree automata,
the translation of an arbitrary formula is obtained once the closure of the class
of recognizable sets under the Boolean operations, cylindrification and projection
is established. In the case of WS2Sy and TACT we need, however, a more specific
property of TACT-recognizable sets than just closure under certain operations: the
closure must not affect the set of components on which the automaton performs
tests. This invariant of the closure operations is needed because, for the closure
under projection of the $i$-th component, we need as a hypothesis that the automaton
does not test on the $i$-th component. An even stronger invariant will be needed
to prove that uniform WS2Sy translates to TAT automata.

Proposition 3. If L is recognizable by a TACT A that tests only on the set of 
components I then there exists a complete and deterministic TACT that tests 
only on I and that recognizes L. 

Proof: Completion is achieved by adding a sink state. The determinisation is 
the same as in |^. Since all rules in the determined automaton have tests that 
are boolean combinations of tests in the original automaton the set of tested 
components is not touched by this operation. □ 

It should be remarked that the determinisation of a tree automaton with 
tests requires the set of constraints to be closed under negation (see HD- 

Proposition 4. If $L$ is recognized by a TACT $A$ that tests only on the set of
components $I$ then there exists a TACT that tests only on $I$ and that recognizes
$T(\Gamma^{n}) - L$.

If $L_1$, $L_2$ are recognized by TACTs $A_1$, $A_2$ that test only on the set of components
$I$ then there exists a TACT that tests only on $I$ and that recognizes $L_1 \cup L_2$,
resp. $L_1 \cap L_2$.

Proof: This is now straightforward (again, see ^). We just remark that the 
closure of the language of constraints by conjunction is used in case of the inter- 
section. □ 
Definition 9. Let $L$ be a set of $\Gamma^{n}$-trees. For $1 \le i \le n$ the $i$-th projection of
$L$ is

3i:L = \ teL} 

For 1 < i < n + 1, the i + 1-th cylindrification of L is 

li:L={t€ I e L} 



Proposition 5. If L is recognizable by a TACT A that tests only on the set of 
components I then there exists a TACT that recognizes \i \ L and tests only on 

$\{j \mid j < i\} \cup \{j + 1 \mid j \geq i\}$.

Proof: Let $L$ be recognized by the TACT $(\Gamma, n, Q, Q_f, \Delta)$. To ease notation we
show the construction for the case $i = n + 1$. It is straightforward to show that
the $(n+1)$-th cylindrification of $L$ is recognized by the TACT $(\Gamma, n + 1, Q, Q_f, \Delta')$ where

Z\' = {(T-.-TT)^g| (T.--T)^gGZi} 

u {(//)(?!, 92 ) q I /(gi, 92 ) 9 G Z\,/ yf T, / T} 

U {(l/)( 9 i, 92) ^ 9 I (A • • • T) ^ 9 e A, / y^ T} 



□ 

Proposition 6. Let $L \subseteq T(\Gamma^{n+1})$ be recognized by a TACT $A$ that tests only on
the set of components $I$, and let $i \notin I$. There exists a TACT that recognizes $\exists i : L$
and tests only on $\{j \mid j < i\} \cup \{j - 1 \mid j > i\}$.

Proof: Let $L$ be recognized by the TACT $A = (\Gamma, n + 1, Q, Q_f, \Delta)$. To ease
notation we show the construction for the case $i = n + 1$.

We can assume w.l.o.g. that $A$ is reduced, that is, for every $q \in Q$ there is
a term $t \in T(\Gamma^{n+1})$ such that $t \to^* q$ (this can simply be obtained by dropping
all states that recognize the empty set). We construct an automaton for $\exists (n+1) : L$
as $A' = (\Gamma, n, Q, Q_f, \Delta')$ where

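$$\Delta' = \{(\bot \cdots \bot) \to q \mid (\bot \cdots \bot\bot) \to q \in \Delta\}$$

$$\cup\ \{(\bot \cdots \bot) \to q \mid (\bot \cdots \bot f)(q_1, q_2) \to q \in \Delta,\ f \neq \bot\}$$

$$\cup\ \{f(q_1, q_2) \xrightarrow{c} q \mid (f g)(q_1, q_2) \xrightarrow{c} q \in \Delta,\ f \neq (\bot \cdots \bot)\}$$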

The proof that A accepts 3n -|- 1 : L is exactly as in the case of classical tree 
automata (see, e.g., P]). The restriction that the automaton A does not use a 
test on the $(n+1)$-th component ensures that the constraints pose no problem
here. □

Proof of Theorem 2: Let a restricted formula $\phi$ be given. We can assume
without loss of generality that every bound variable of $\phi$ has exactly one binding
quantifier and that no variable of $\phi$ has both bound and free occurrences in $\phi$.
We construct inductively an automaton for all subformulae of $\phi$.

We get an automaton for atomic formulae as in the case of classical tree
automata, resp. by Proposition 2. Then we cylindrify to have components according
to all free and bound variables of $\phi$. This yields, by Proposition 5, an automaton
which does not test on components corresponding to bound variables. In the case of
composite formulae we conclude by Propositions 4 and 6.

If the formula is uniform then the automata corresponding to the atomic
formulae have (after cylindrification) only tests True, $1.1 = 2.1 \wedge \ldots \wedge 1.n = 2.n$
and $1.1 \neq 2.1 \vee \ldots \vee 1.n \neq 2.n$, where we assume for simplicity that the
components corresponding to the global variables are $1, \ldots, n$. Since all closure
operations on automata construct tests that are Boolean combinations of tests
of the input automata, the final automaton too has only tests of this form. Since
the final automaton works on the components $1, \ldots, n$ it is indeed a TAT. □



8 Translating Automata to Formulae 

In this section we prove Theorem 3 by translating a TACT-automaton $A =
(\Gamma, n, Q, Q_f, \Delta)$, where $\Gamma$ has cardinality $m$, into a formula $\phi_A$ with $n(m + 1)$ free
variables.

Let $\Gamma = \{c_1, \ldots, c_m\}$; we will use in addition the convention $c_0 = \bot$. Let $Q =
\{q_1, \ldots, q_q\}$. We can assume w.l.o.g. that all constraints of the productions of
$A$ are conjunctions of equations and inequations, since a Boolean combination of
constraints can be brought into an equivalent disjunctive normal form and a rule
$f(q_1, q_2) \xrightarrow{c \vee d} q$ can be replaced by the two rules $f(q_1, q_2) \xrightarrow{c} q$ and $f(q_1, q_2) \xrightarrow{d} q$.

First we define some auxiliary formulas. The formula $partition(Y, Y_1, \ldots, Y_k)$
states that $(Y_1, \ldots, Y_k)$ is a partition of $Y$:

$$partition(Y, Y_1, \ldots, Y_k) = \Big( Y = Y_1 \cup \ldots \cup Y_k \ \wedge \bigwedge_{i \neq j} Y_i \cap Y_j = \emptyset \Big)$$

The formula $term(X, X_0^1, \ldots, X_m^n)$ states that, for some tree $t \in T(\Gamma^{n})$,
$X$ is the set of occurrences of $t$ and $(X_0^1, \ldots, X_m^n) = ((t))$:

$$\begin{aligned}
term(X, X_0^1, \ldots, X_m^n) = {} & \bigwedge_{i=1}^{n} partition(X, X_0^i, \ldots, X_m^i) \\
\wedge\; & \forall x, y\, (x < y \wedge y \in X \to x \in X) \\
\wedge\; & \forall x\, (x \in X_0^1 \cap \ldots \cap X_0^n \to (x1 \notin X \wedge x2 \notin X)) \\
\wedge\; & \forall x\, (x \in X \wedge x \notin X_0^1 \cap \ldots \cap X_0^n \to (x1 \in X \wedge x2 \in X))
\end{aligned}$$

The first clause of the formula term says that each component of every occurrence
is marked with exactly one symbol and that $X$ is the set of occurrences, the
second clause says that the set of occurrences is prefix-closed, the third clause
says that any occurrence marked $(\bot \ldots \bot)$ is a leaf, and the fourth clause that every
occurrence not marked with $(\bot \ldots \bot)$ has two subtrees. Finally, the formula is
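$$\begin{aligned}
\phi_A(X_0^1, \ldots, X_m^1, \ldots, X_0^n, \ldots, X_m^n) = \exists X, Y_1, \ldots, Y_q \Big(\; & term(X, X_0^1, \ldots, X_m^n) \\
\wedge\; & partition(X, Y_1, \ldots, Y_q) \\
\wedge\; & \bigvee_{q_i \in Q_f} \epsilon \in Y_i \\
\wedge\; & \forall x \Big( x \in X_0^1 \cap \ldots \cap X_0^n \to \bigvee_{(\bot \cdots \bot) \to q_i \,\in\, \Delta} x \in Y_i \Big) \\
\wedge\; & \bigwedge_{(c_{a_1} \cdots c_{a_n}) \neq (\bot \cdots \bot)} \forall x \Big( x \in X_{a_1}^1 \cap \ldots \cap X_{a_n}^n \to \bigvee_{(c_{a_1} \cdots c_{a_n})(q_j, q_k) \xrightarrow{c} q_i \,\in\, \Delta} \big( \\
& \quad x \in Y_i \wedge x1 \in Y_j \wedge x2 \in Y_k \\
& \quad \wedge \bigwedge_{1.l = 2.l \,\in\, c} Sy(x, X_0^l, \ldots, X_m^l) \ \wedge \bigwedge_{1.l \neq 2.l \,\in\, c} \neg Sy(x, X_0^l, \ldots, X_m^l) \big) \Big) \Big)
\end{aligned}$$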

In this formula, the first conjunct says that the free variables are the encoding
of a $\Gamma^{n}$-tree with set of occurrences $X$. The second conjunct says that every
occurrence of the tree is assigned a state. The third conjunct expresses that the
root of the tree is assigned an accepting state, and the last two conjuncts express
that the assignment is according to the transition rules of the automaton: the
fourth conjunct covers the case of the constant $(\bot \ldots \bot)$ and the complex last
conjunct the case of a binary function symbol.

Note that the formula $\phi_A$ is restricted. If the automaton is in fact a TAT
automaton then the formula $\phi_A$ is equivalent to a uniform WS2Sy-formula since

$$Sy(x, X_0^i, \ldots, X_m^i) \wedge Sy(x, X_0^j, \ldots, X_m^j)$$

is equivalent to the formula

$$Sy(x, X_0^i, \ldots, X_m^i, X_0^j, \ldots, X_m^j)$$

Finally, if the automaton $A$ is a classical tree automaton then $\phi_A$ contains no
synchronization predicate and is hence a WS2S-formula.



9 Other Classes of Automata with Tests 

So far we have only considered the class of automata with tests between brothers
introduced in [1]. We can easily extend the results of this paper to classes
of automata with “deep” tests that can perform tests like $11.i = 222.i$, where a
tree t satisfies this test if {t |n)^*^ = {t 1222 )^*^ jS]. This generalization is easily 
achieved by defining a generalization of WS2Sy with stronger synchronization 
predicates. These classes of automata are, however, of limited interest since al- 
ready the class of automata with tests between cousin positions (that is p.i = q.i 
where p and q have length 2) has an undecidable emptiness problem [^. 

There is a very interesting class of tree automata with tests that has a de- 
cidable emptiness problem: the class of reduction automata [3] and generalized 
reduction automata [2]. These automata allow “deep” (in the above sense) tests 
but come with a syntactic restriction that ensures that on every branch of a tree 
only a bounded number of equality tests can be performed. Hence, the undecid- 
ability proof of this paper does not directly apply to a generalization of these 
automata to component-wise tests since we used the fact that we can perform an 
unbounded number of equality tests on the left spine of the tree (see Section SI). 
Thus, it is still possible that the results known for (generalized) reduction
automata can be lifted to component-wise tests, and that a corresponding logic
with a restriction on the set of admissible formulae can be found.
Abstract. In the compositional verification of a concurrent system, one
seeks to deduce properties of the system from properties of its constituent
modules. This paper supplements our previous work on the same subject
to provide a comprehensive compositional framework in linear-time
temporal logic. It has been shown by many that specifying properties 
of a module in the assumption-guarantee style is effective in achieving 
compositionality. We consider two forms of temporal formulas that cor- 
respond to two interpretations of an assumption-guarantee specification 
and investigate how they can be applied in compositional verification. 
We argue by examples that the two forms complement each other and 
both are needed to facilitate the compositional approach. We also show 
how to handle assumption-guarantee specifications where the assumption 
contains a liveness property. 



1 Introduction 

A concurrent system typically is or can be decomposed as the parallel composi- 
tion of several modules. In the compositional verification of a system, one seeks 
to deduce properties of the system from properties of its constituent modules. 
We assume that the system to be verified is closed, i.e., the system is meant to 
be executed in isolation (without any interferences, except perhaps part of the 
initialization, from the environment). Nonetheless, we provide sufficient details 
showing how the results of this paper can be extended straightforwardly to the 
compositional verification of an open system, which is essentially a module. 

Properties of a system are represented by assertions on computations of the 
system and so are properties of a module. Computations of a system are the 
sequences of states produced when the system is executed in isolation. In con- 
trast, computations of a module are the sequences of states produced when the 
module is executed in parallel with an arbitrary but (syntactically) compati- 
ble environment, i.e., the computations of an imaginary system obtained from 
composing the module with the arbitrary environment. A system or module 

* This research was supported in part by grants NSC 86-2213-E-002-002 and NSC 87- 
2213-E-002-015 from the National Science Council, Taiwan (R.O.C.) and a research 
award from College of Management, National Taiwan University. 

J. Tiuryn (Ed.): FOSSACS 2000, LNCS 1784, pp. 344- fHHHl 2000. 

(c) Springer-Verlag Berlin Heidelberg 2000
satisfies a certain property if the corresponding assertion holds for each of its
computations. 

A module will behave properly only if its environment does. When specify- 
ing properties of a module, one should therefore include (1) assumed proper- 
ties about its environment and (2) guaranteed properties of the module if the 
environment obeys the assumption. This type of specification is essentially a 
generalization of pre and post-conditions for sequential programs M- The gen- 
eralization was adopted in the early 1980’s by Misra and Chandy Jones 
and Lamport [19] and became the so-called assumption-guarantee (also known
as rely-guarantee or assumption-commitment) paradigm. 

Consider an assumption-guarantee specification with assumption A and guar- 
antee G. There are at least two possible interpretations of the specification over 
a sequence of states. Informally, one interpretation states that G holds at least 
one step longer than A does. The other states that G holds as long as A does, 
which is a weaker interpretation than the first. A third even weaker interpreta- 
tion is the ordinary implication from A to G; however, it is practically equivalent 
to the second interpretation, as a module should not have the ability to predict 
the future behavior of its environment and hence the future violation of A by 
its environment. We refer to properties according to the first interpretation as 
strong assumption-guarantee properties and those according to the second as 
weak assumption-guarantee properties. As has been pointed out by Abadi and 
Lamport [2], if A and G cannot be falsified simultaneously by any step of the
module or its environment, then the two interpretations are equivalent. 

In this paper, we intend to further advance the use of temporal logic in speci- 
fying and reasoning about assumption-guarantee properties and investigate how 
this kind of properties can be applied in compositional verification. Temporal 
logic is one convenient formalism for specifying the behavior of a concurrent 
system. The idea of representing concurrent systems and their specifications as 
formulas in temporal logic was first proposed by Pnueli m- 

We have proposed in [16] to formulate assumption-guarantee specifications
using the linear-time temporal logic (LTL) of Manna and Pnueli |^. We showed 
how to specify and reason about strong assumption-guarantee properties in the 
full set of LTL. Our formulation of assumption-guarantee specifications as well 
as the derived composition rules are syntactic and entirely within LTL. This pa- 
per complements and differs from our previous work in three aspects: First, we 
consider both strong and weak (not just strong) assumption-guarantee properties 
in this paper. Second, we emphasize the use of assumption-guarantee specifica- 
tions in compositional verification rather than hierarchical development. Last, we 
extend the previous work to include composition rules that permit assumptions 
with liveness properties. Hiding (of local variables) is not treated here for a more 
focused exposition; it can be handled syntactically in the same way as in our pre- 
vious work. Together this paper and the previous work provide a comprehensive 
compositional framework in LTL. 

Related works on assumption-guarantee specifications, including psjuanni 
[niiiiEiisiEiiiniEZj, typically reason about relevant properties at the semantic 
level or define a special-purpose logic. In PQ, Abadi and Lamport gave a compre- 
hensive treatment of compositionality in a general semantic setting with agents 
(which are used essentially for identifying a module). Their semantic composition 
rule used the notion of the “realizable part” of a specification which in general 
cannot be extracted by simpler operations on the specification. Xu, Cau, and 
Collette [23] provided an explanation of the difference between two well-known
composition rules respectively for message-passing and shared- variable models. 
They show that the two rules can be derived from a more general one. In [4], 
Alur and Henzinger suggested the notion of local liveness in place of the weaker 
notion of receptiveness involved in the compositionality issue. They argue that 
receptiveness is unnecessarily weak and computationally hard to check, while 
local liveness on the other hand is satisfied by most existing models and is easier 
to check. A collection of survey papers on the general subject of compositional 
verification has recently been published as m- 

Barringer and Kuiper are, to our knowledge, the first to formulate 
assumption-guarantee specifications in temporal logic. They used the notion of 
an agent and considered only strong assumption-guarantee properties. Manna 
and Pnueli proposed a compositional verification rule using weak
assumption-guarantee properties in their recent book [22]. Using the Temporal Logic of
Actions (TLA, a variant of temporal logic) [20], the work of Abadi and
Lamport [2] is an improvement over earlier temporal logic-based works in handling
hiding and liveness properties. They focused on assumption-guarantee specifica- 
tions where the assumption and the guarantee cannot be falsified simultaneously. 
With a limited set of temporal operators in TLA, they had to work mostly at 
the semantic level. Abadi and Lamport’s formulation of an assumption-guarantee 
specification allows liveness properties in the assumption part. However, their 
composition rule only works for safety assumptions. Collette m, adapting the 
work of [2], proposed a UNITY-like [7] logic for assumption-guarantee
specifications with restricted forms of assumption and guarantee.

Assumption-guarantee specifications have also found applications in the area 
of model checking [8]. They are useful for compositional (or modular) model 
checking, which provides one possible way to tackle the state-explosion problem. 
Virtually all existing works on modular model checking are for branching-time 
temporal logic or a combination of linear-time and branching-time logics. Grum- 
berg and Long m considered a subset of CTL (Computation Tree Logic, a 
branching-time temporal logic) for which satisfaction is preserved under parallel 
composition. In their work, the assumption of the specification of a module is 
represented by another abstract module; the composition of the two modules is 
then checked against the desired property. In [5], Aziz et al. proposed to reduce
the size of each module of a system via an equivalence so that the given specifi- 
cation is preserved. Their method handles full CTL. The complexity of modular 
model checking of CTL formulas has been shown to be at least as high as that 
of (propositional) LTL formulas by Kupferman and Vardi P^HTIITH] . 
2 Preliminaries 

2.1 Temporal Logic 

Linear-time temporal logic (LTL) is a logic for expressing assertions on infinite 
sequences of states, where each state is an assignment to a predefined universe 
of variables. An LTL formula is interpreted with respect to a position $i \geq 0$ in a
sequence of states. State formulas are the basic type of LTL formula built only 
from variables, constants, functions, and predicates using the usual first-order 
logic connectives. The interpretation of a state formula in position i is performed 
as usual using the particular interpretation of variables in state i (plus the fixed 
interpretations of constants, functions, and predicates). General LTL formulas 
also contain temporal operators; in this paper, we will use only the following: 

— $\bigcirc$ means “in the next state”. The formula $\bigcirc\varphi$ is true in position $i$ of a
sequence $\sigma$ (denoted $(\sigma, i) \models \bigcirc\varphi$) iff $\varphi$ is true in position $i + 1$ of $\sigma$ (i.e.,
$(\sigma, i + 1) \models \varphi$).

— $\Box$ means “always in the future (including the present)”; $(\sigma, i) \models \Box\varphi$ iff
$\forall k \geq i : (\sigma, k) \models \varphi$.

— $\widetilde{\ominus}$ means “in the previous state, if there is any”; $(\sigma, i) \models \widetilde{\ominus}\varphi$ iff $(i > 0) \to
((\sigma, i - 1) \models \varphi)$. $\widetilde{\ominus}$ is a weaker version of $\ominus$, which means “in the previous
state”; $(\sigma, i) \models \ominus\varphi$ iff $(i > 0) \wedge ((\sigma, i - 1) \models \varphi)$. It follows that $(\sigma, i) \models \widetilde{\ominus}\varphi$
iff $(\sigma, i) \models \neg\ominus\neg\varphi$.

— $\boxminus$ means “always in the past (including the present)”; $(\sigma, i) \models \boxminus\varphi$ iff $\forall k :
0 \le k \le i : (\sigma, k) \models \varphi$.

— For a variable $u$, the interpretation of $u^-$ (the previous value of $u$) in position
$i$ is the same as the interpretation of variable $u$ in position $i - 1$; by convention,
the interpretation of $u^-$ in position 0 is the same as the interpretation
of $u$ in position 0.

— first is an abbreviation for $\widetilde{\ominus}\,false$, which is true only in position 0.

We say that a sequence $\sigma$ satisfies a formula $\varphi$ (or $\varphi$ holds for $\sigma$) if $(\sigma, 0) \models \varphi$.
A formula $\varphi$ is valid, denoted $\models \varphi$ or simply $\varphi$ when it is clear that validity is
intended, if $\varphi$ is satisfied by every sequence.
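
The past operators above never look beyond the current position, so they can be evaluated on a finite prefix of a sequence. The following Python sketch does this under a few stated assumptions: formulas are modelled as functions `(seq, i) -> bool`, states are dictionaries from variable names to values, and every identifier is a hypothetical name chosen for this illustration. The future operators are deliberately left out because they need the whole infinite sequence.

```python
def weak_prev(p):
    """'In the previous state, if there is any': holds trivially at position 0."""
    return lambda seq, i: i == 0 or p(seq, i - 1)


def prev(p):
    """'In the previous state': fails at position 0."""
    return lambda seq, i: i > 0 and p(seq, i - 1)


def always_past(p):
    """'Always in the past (including the present)'."""
    return lambda seq, i: all(p(seq, k) for k in range(i + 1))


def neg(p):
    """Negation of a formula."""
    return lambda seq, i: not p(seq, i)


def prev_value(seq, i, var):
    """Value of var in the previous state; at position 0 it is the current value."""
    return seq[max(i - 1, 0)][var]


FALSE = lambda seq, i: False
first = weak_prev(FALSE)  # true only in position 0
```

With `seq = [{"u": 3}, {"u": 5}, {"u": 5}]`, `first` holds only at position 0, and `weak_prev(p)` agrees with `neg(prev(neg(p)))` at every position, matching the duality stated in the list.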

A formula without temporal operators but possibly with “$-$”-superscribed
variables is called a transition formula; this definition is slightly different from 
that in m, where a transition formula always contains -^first as a conjunct. A 
formula without any future operator $\bigcirc$, $\Box$, or $\Diamond$ (though liveness is considered,
$\Diamond$ is not explicitly used in this paper) is called a past formula; in particular,
a transition formula is a past formula. A safety formula is one that specifies a 
safety property and a liveness formula is one that specifies a liveness property. Of 

^ In contrast to Lamport and others who use “$'$”-superscribed (or primed) variables
to denote their values in the next state, we use “$-$”-superscribed variables to denote
their values in the previous state. The reason is that (for conformity) we wish to use
only past operators, except the outmost $\Box$, in the safety part of a specification. The
introduction of “$-$”-superscribed variables is convenient but not essential, since they
can be encoded by the $\ominus$ operator.
local a, b : integer where a = b = 0

P_a :: [ loop forever do [a := b + 1] ]   ||   P_b :: [ loop forever do [b := a + 1] ]

Fig. 1. Program KEEP-AHEAD.



particular importance, formulas of the form $\Box H$, where $H$ is a past formula, are
for certain safety formulas; they will be referred to as canonical safety formulas.
Specific forms of liveness formulas are not important for our purposes. Formulas
of the form $\Box H \wedge L$, where $H$ is a past formula and $L$ a liveness formula, will
be referred to as canonical formulas.

2.2 Specifying Concurrent Systems 

A concurrent system consists of a set of variables, an initial condition on the 
variables, and a set of transitions that specify how the system may change the 
values of its variables in an execution step. Semantically, a concurrent system is 
associated with a set of computations or sequences of states, each of which rep- 
resents a possible execution of the system. We will mostly concentrate on safety 
properties of a system. For our purpose, we distinguish two kinds of specification: 
system specification and requirement specification. 

System specifications are basically programs in the form of a temporal
formula. Consider Program KEEP-AHEAD in Figure 1. The system specification of
KEEP-AHEAD is given by $\Phi_{\text{KEEP-AHEAD}}$ as defined below.

$$\Phi_{\text{KEEP-AHEAD}} = (a = 0) \wedge (b = 0) \wedge \Box\left(\begin{array}{l} \phantom{\vee\ }(a = b^- + 1) \wedge (b = b^-) \\ \vee\ (b = a^- + 1) \wedge (a = a^-) \\ \vee\ (a = a^-) \wedge (b = b^-) \end{array}\right)$$

The formula $\Phi_{\text{KEEP-AHEAD}}$ states that the values of $a$ and $b$ are initially 0.
It also states via the disjunction of three transition formulas that, in each step
of an execution, either the value of $a$ becomes $b + 1$ (while the value of $b$ is
unchanged), the value of $b$ becomes $a + 1$, or nothing is changed. The transition
formula $(a = a^-) \wedge (b = b^-)$ is called a stuttering transition and is included to
make the specification invariant under stuttering.
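
To make the reading of this specification concrete, the following Python sketch checks a finite run against the initial condition and the three transition disjuncts. It is only an illustration under stated assumptions: states are pairs `(a, b)`, the always operator is checked over the positions of a finite prefix, and the names `step_ok`, `satisfies_spec` and `random_run` are hypothetical. Position 0 is compared with itself, following the convention that a previous value in position 0 equals the current value.

```python
import random


def step_ok(prev, cur):
    """One of the three transition disjuncts holds between prev and cur."""
    a0, b0 = prev
    a, b = cur
    return (
        (a == b0 + 1 and b == b0)      # a becomes b + 1, b unchanged
        or (b == a0 + 1 and a == a0)   # b becomes a + 1, a unchanged
        or (a == a0 and b == b0)       # stuttering transition
    )


def satisfies_spec(run):
    """Initial condition plus the transition formula at every position of the prefix."""
    if run[0] != (0, 0):
        return False
    return all(step_ok(run[max(i - 1, 0)], run[i]) for i in range(len(run)))


def random_run(length, seed=0):
    """Generate a run by letting P_a, P_b or nobody move at each step."""
    rng = random.Random(seed)
    a = b = 0
    run = [(a, b)]
    for _ in range(length - 1):
        move = rng.choice(["Pa", "Pb", "stutter"])
        if move == "Pa":
            a = b + 1
        elif move == "Pb":
            b = a + 1
        run.append((a, b))
    return run
```

Any run produced by `random_run` passes `satisfies_spec`, whereas a run such as `[(0, 0), (2, 0)]` is rejected because no disjunct explains the step.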

We regard system specifications as formal definitions of concurrent systems 
so that we can do without a formal semantics of the programming language; 
programs are informal notations for readability. To take fairness into account, 
one may conjoin an appropriate liveness formula to the system specification. 
The safety formula in a system specification can be put in the canonical form
of $\Box H$, specifically in the form of $\Box((first \wedge Init) \vee (\neg first \wedge N))$, where $Init$ is
a state formula and $N$ the disjunction of several transition formulas. As $N$ will
module M_a                                  module M_b
  in b : integer                              in a : integer
  own out a : integer where a = 0             own out b : integer where b = 0

  loop forever do                     ||      loop forever do
    [a := b + 1]                                [b := a + 1]

Fig. 2. Program KEEP-AHEAD as the parallel composition of two modules.

always contain a stuttering transition, $\Box((first \wedge Init) \vee (\neg first \wedge N))$ simplifies
to $\Box((first \wedge Init) \vee N)$.

Requirement specification is the usual type of temporal-logic specification.
A property is represented by a temporal formula. A system (program) $S$ is said
to satisfy a formula $\varphi$ if every computation of $S$ satisfies $\varphi$. Let $\Phi_S$ denote the
system specification of $S$. We will regard $\Phi_S \to \varphi$ as the formal definition of the
fact that $S$ satisfies $\varphi$, denoted as $S \models \varphi$. The safety formula in a requirement
specification can usually be put in the canonical form.

2.3 Parallel Composition as Conjunction 

Program KEEP-AHEAD can be decomposed as the parallel composition of two
modules as shown in Figure 2. A module may read but not change the value of
an in (input) variable. A compatible environment of a module may read but not
change the value of an own out (owned output) variable of the module. In the
system $M_a \parallel M_b$, $M_b$ is the environment of $M_a$ and $M_a$ is the environment of
$M_b$; both are clearly compatible with each other.

The system specifications $\Phi_{M_a}$ and $\Phi_{M_b}$ of modules $M_a$ and $M_b$ respectively
are defined as follows:



It is perhaps more accurate to say that $\Phi_{M_a}$ is the system specification of an
imaginary system composed of $M_a$ and an arbitrary but compatible environment;
analogously for $\Phi_{M_b}$. A little calculation shows that

$$\vdash \Phi_{M_a} \wedge \Phi_{M_b} \leftrightarrow \Phi_{\text{KEEP-AHEAD}}.$$

This formally confirms that $M_a \parallel M_b$ is equivalent to Program KEEP-AHEAD.

A module $M$ is said to satisfy a formula $\varphi$ if every computation of $M$ satisfies
$\varphi$. Let $\Phi_M$ denote the system specification of $M$. As in the case of specifying
properties of a concurrent system, we will regard $\Phi_M \to \varphi$ as the formal definition
of the fact that $M$ satisfies $\varphi$, denoted as $M \models \varphi$. Since parallel composition is
conjunction, it follows that, if $M$ is a module of system $S$, then $M \models \varphi$ implies
$S \models \varphi$.
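
The idea that parallel composition is conjunction can be checked by brute force on the transition parts of the specifications. The Python sketch below is an illustration only: the transition formulas used for the two modules are our assumption about a natural reading of Fig. 2 (a module either performs its own assignment while its input stays fixed, or leaves its own variable unchanged while the environment does anything), since the displayed definitions are not available in this text. All function names are hypothetical.

```python
from itertools import product


def step_keep_ahead(a0, b0, a, b):
    """Transition part of the KEEP-AHEAD specification, as stated in Section 2.2."""
    return (
        (a == b0 + 1 and b == b0)
        or (b == a0 + 1 and a == a0)
        or (a == a0 and b == b0)
    )


def step_ma(a0, b0, a, b):
    """ASSUMED transition formula of module M_a: do a := b + 1, or keep a unchanged."""
    return (a == b0 + 1 and b == b0) or a == a0


def step_mb(a0, b0, a, b):
    """ASSUMED transition formula of module M_b: do b := a + 1, or keep b unchanged."""
    return (b == a0 + 1 and a == a0) or b == b0


def conjunction_matches(values=range(-3, 4)):
    """Compare the conjunction of the module steps with the system step on a small domain."""
    return all(
        (step_ma(*v) and step_mb(*v)) == step_keep_ahead(*v)
        for v in product(values, repeat=4)
    )
```

Under these assumed module formulas, `conjunction_matches()` confirms the equivalence on the sampled domain, and `step_ma(0, 0, 0, 5)` shows that a module on its own tolerates arbitrary changes of its input by the environment.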

3 Assumption-Guarantee Specifications 

We shall concentrate on assumption-guarantee specifications where both the 
assumption and the guarantee are safety properties; liveness will be treated in 
Section El We assume that safety properties are expressed as canonical safety 
formulas of the form $\Box H$, where $H$ is a past formula.



3.1 Strong Assumption-Guarantee Formulas 

Strong assumption-guarantee formulas specify strong assumption-guarantee prop- 
erties. A strong assumption-guarantee property of a module with assumption A 
and guarantee G asserts the following: 

For every computation of the module, $G$ holds initially and, for every
$i \geq 1$, if $A$ holds for the prefix of length $i - 1$ (i.e., with $i - 1$ states),
then $G$ also holds for the prefix of length $i$.

Notice that, if a safety property does not hold for a prefix of a computa- 
tion, then the property will not hold for any longer prefix. The above assertion 
therefore says that G holds at least one step longer than A does. 

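As $A$ and $G$ are given respectively as $\Box H_A$ and $\Box H_G$, where $H_A$ and
$H_G$ are past formulas, the strong assumption-guarantee property can be expressed
as $\Box(\tilde{\ominus}\boxminus H_A \rightarrow \boxminus H_G)$, which is equivalent
to $\Box(\tilde{\ominus}\boxminus H_A \rightarrow H_G)$. Note that
$\Box(\tilde{\ominus}\boxminus H_A \rightarrow H_G)$ implies that $H_G$ holds
initially, since $\tilde{\ominus}\boxminus H_A$ always holds in position 0 of a
sequence. To summarize, we define strong assumption-guarantee formulas of the
form $A \triangleright G$ as follows:

$A \triangleright G$ (i.e., $\Box H_A \triangleright \Box H_G$) $\triangleq \Box(\tilde{\ominus}\boxminus H_A \rightarrow H_G)$

Note that $A \triangleright G$ is also a canonical safety formula.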
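
As an illustration only, the definition can be evaluated on a finite prefix of a
computation. The sketch below assumes that the truth values of $H_A$ and $H_G$ at
each position have already been computed and are passed in as Boolean lists; the
function name `strong_ag` is hypothetical and does not come from the paper.

```python
from typing import Sequence


def strong_ag(h_a: Sequence[bool], h_g: Sequence[bool]) -> bool:
    """Evaluate the strong assumption-guarantee formula on a finite prefix.

    h_a[i] and h_g[i] are the truth values of the past formulas H_A and H_G
    at position i. The check mirrors "weak-previous has-always-been H_A
    implies H_G" at every position.
    """
    # The weak previous operator holds vacuously at position 0.
    a_held_before = True
    for a_now, g_now in zip(h_a, h_g):
        if a_held_before and not g_now:
            return False  # G broke while A had held on all earlier positions
        a_held_before = a_held_before and a_now
    return True
```

Under these assumptions, `strong_ag([True, False, False], [True, True, False])`
succeeds because $H_G$ fails only after $H_A$ has already failed one position
earlier, whereas `strong_ag([False], [False])` fails because $H_G$ must hold
initially.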

Theorem 1. Suppose that $H_{G_1}$ and $H_{G_2}$ are past formulas. Then,

$\vdash (\Box H_{G_1} \triangleright \Box H_{G_2}) \wedge (\Box H_{G_2} \triangleright \Box H_{G_1}) \rightarrow \Box H_{G_1} \wedge \Box H_{G_2}$.

The above theorem is essentially the composition principle formulated by
Misra and Chandy. This small result shows that strong assumption-guarantee
formulas have a mutual induction mechanism built in and hence permit “circular
reasoning” (there is of course no real cycle if one looks at the semantic models
and reasons state by state from the initial one), i.e., deducing new properties
from mutually dependent properties.

^ Since $H_A$ and $H_G$ are past formulas, “$\Box H_A$ holds for the prefix of
length $i - 1$ of $\sigma$” can be formally stated as
“$(\sigma, i) \models \tilde{\ominus}\boxminus H_A$” and “$\Box H_G$ holds for the
prefix of length $i$ of $\sigma$” as “$(\sigma, i) \models \boxminus H_G$”.
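
The mutual induction can be checked by brute force on short Boolean traces. The
following is a sanity check under simplifying assumptions, not a proof: $H_{G_1}$
and $H_{G_2}$ are treated as arbitrary Boolean sequences, and the helper names are
hypothetical.

```python
from itertools import product


def strong_ag(h_a, h_g):
    """Strong assumption-guarantee check on a finite prefix (illustrative)."""
    held_before = True  # weak previous is vacuously true at position 0
    for a_now, g_now in zip(h_a, h_g):
        if held_before and not g_now:
            return False
        held_before = held_before and a_now
    return True


def circular_rule_holds(max_len=5):
    """Check the circular rule on every pair of traces up to max_len."""
    for n in range(1, max_len + 1):
        for g1 in product([False, True], repeat=n):
            for g2 in product([False, True], repeat=n):
                premises = strong_ag(g1, g2) and strong_ag(g2, g1)
                if premises and not (all(g1) and all(g2)):
                    return False  # would be a counterexample
    return True
```

The check succeeds because each direction forces the other guarantee to hold one
position further, starting from position 0.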

We now state a general rule for composing strong assumption-guarantee
properties; this rule has been proven in [16].

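Theorem 2. Suppose that $A_i = \Box H_{A_i}$, $G_i = \Box H_{G_i}$, $A = \Box H_A$, and
$G = \Box H_G$, all in the canonical form. Then,

1. $\vdash \Box(\boxminus H_A \wedge \boxminus \bigwedge_{i=1}^{n} H_{G_i} \rightarrow H_{A_j})$ for $1 \leq j \leq n$
2. $\vdash \Box(\tilde{\ominus}\boxminus H_A \wedge \boxminus \bigwedge_{i=1}^{n} H_{G_i} \rightarrow H_G)$
----------------------------------------------------------------
$\vdash \bigwedge_{i=1}^{n} (A_i \triangleright G_i) \rightarrow (A \triangleright G)$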

Intuitively, Premise 1 of the above composition rule says that the assumption 
about the environment of a module should follow from the guarantees of other 
modules and the assumption about the environment of the entire (open) system, 
while Premise 2 says that the guarantee of the entire system should follow from 
the guarantees of individual modules and the assumption about its environment. 
For closed systems, we take A to be true and simplify the rule as follows: 

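Theorem 3. Suppose that $A_i = \Box H_{A_i}$, $G_i = \Box H_{G_i}$, and $G = \Box H_G$,
all in the canonical form. Then,

1. $\vdash \Box(\boxminus \bigwedge_{i=1}^{n} H_{G_i} \rightarrow H_{A_j})$ for $1 \leq j \leq n$
2. $\vdash \Box(\boxminus \bigwedge_{i=1}^{n} H_{G_i} \rightarrow H_G)$
----------------------------------------------------------------
$\vdash \bigwedge_{i=1}^{n} (A_i \triangleright G_i) \rightarrow G$

Theorem 1, stated earlier, follows immediately from this theorem.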

3.2 Weak Assumption-Guarantee Formulas 

Weak assumption-guarantee formulas specify weak assumption-guarantee prop- 
erties. A weak assumption-guarantee property of a module with assumption A 
and guarantee G asserts the following: 

For every computation of the module, if A holds for some prefix of the 
computation, then G also holds for the same prefix. 

Notice again that, if a safety property does not hold for a prefix of a compu- 
tation, then the property will not hold for any longer prefix. The above assertion 
therefore says that G holds as long as A does. 

With $A$ and $G$ given respectively as $\Box H_A$ and $\Box H_G$, the weak
assumption-guarantee property can be expressed as
$\Box(\boxminus H_A \rightarrow \boxminus H_G)$, which is equivalent to
$\Box(\boxminus H_A \rightarrow H_G)$. Hence, we define weak assumption-guarantee
formulas of the form $A \triangleright_w G$ as follows:

$A \triangleright_w G$ (i.e., $\Box H_A \triangleright_w \Box H_G$) $\triangleq \Box(\boxminus H_A \rightarrow H_G)$

Weak assumption-guarantee formulas lack the kind of mutual induction mech- 
anism built into strong assumption-guarantee formulas and cannot be readily 
composed. 

A Quick Comparison between Strong and Weak Assumption-Guarantee Formulas:
For a property $\Box H_A \triangleright \Box H_G$ to hold for the computations of a
module $M$, no step of an environment compatible with $M$ should be able to falsify
both $H_A$ and $H_G$. On the other hand, $\Box H_A \triangleright_w \Box H_G$ does
not have this constraint. This
distinction is further elaborated in Section [3 
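
The difference between the two forms shows up on very short traces. The sketch
below is illustrative only: it evaluates both forms on finite Boolean sequences
standing for the truth values of $H_A$ and $H_G$, and the function names are
hypothetical.

```python
def strong_ag(h_a, h_g):
    """G must hold wherever A held on all strictly earlier positions."""
    held_before = True
    for a_now, g_now in zip(h_a, h_g):
        if held_before and not g_now:
            return False
        held_before = held_before and a_now
    return True


def weak_ag(h_a, h_g):
    """G must hold wherever A held on all positions up to and including now."""
    held_so_far = True
    for a_now, g_now in zip(h_a, h_g):
        held_so_far = held_so_far and a_now
        if held_so_far and not g_now:
            return False
    return True


# One step falsifies H_A and H_G simultaneously at position 1.
h_a = [True, False]
h_g = [True, False]

# Two properties that never hold: each weakly "guarantees" the other.
never = [False, False]
```

On `h_a`, `h_g` the weak form holds while the strong form fails, which is the
constraint described above. On `never` the weak form holds in both directions
although neither property holds, which illustrates why weak formulas lack the
mutual induction mechanism.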



4 Compositional Verification 



We present two compositional verification rules: one using strong assumption- 
guarantee formulas and the other using weak assumption-guarantee formulas. 



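Theorem 4 (Rule MOD-S). Suppose that $A_i$, $G_i$, and $G$ are canonical safety
formulas. Then,

$M_i \models A_i \triangleright G_i$ for $1 \leq i \leq n$
$\vdash \bigwedge_{i=1}^{n} (A_i \triangleright G_i) \rightarrow G$
----------------------------------------------------------------
$\parallel_{i=1}^{n} M_i \models G$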

The first premise may be established by applying a verification rule for
canonical safety formulas from [22, Chapter 4] (recall that
$(A_i \triangleright G_i) = \Box(\tilde{\ominus}\boxminus H_{A_i} \rightarrow H_{G_i})$
is a canonical safety formula), while the second premise may be established
by applying the composition rule from Theorem 3 or the simpler Theorem 1. If
all the modules are finite, then both premises may be established by a
suitable model checker. Nonetheless, Theorem 3 may still be useful for reducing
the complexity of checking validity.

Regarding compositional verification using weak assumption-guarantee formulas,
Manna and Pnueli have proposed in [22, Page 337] a compositional verification
rule where the property of a module is exactly in the form of a weak
assumption-guarantee formula. Consider a system $S$ that is equivalent to
$\parallel_{i=1}^{n} M_i$. Translated into our notation, the compositional
verification rule reads as follows.



Theorem 5 (Rule MOD-W). Suppose that $S$ is a system equivalent to
$\parallel_{i=1}^{n} M_i$ and $A$ and $G$ are canonical safety formulas. Then,

$S \models A$
$M_j \models A \triangleright_w G$ for some $j$
----------------------------------------------------------------
$S \models G$

As pointed out in [22], Rule MOD-W is normally applied in an incremental
manner. One typically starts with $A$ replaced by true and proves some property
$\Box H_1$ of a module; one then uses $\Box H_1$ in place of $A$ to prove another
property $\Box H_2$ of another module; and so on.

The rule could have been formulated as:

$S \models \Box H_A$
$M_j \models \Box H_A \rightarrow \Box H_G$ for some $j$
----------------------------------------------------------------
$S \models \Box H_G$

The new rule looks simpler. However, to establish
$M_j \models \Box H_A \rightarrow \Box H_G$, it is inevitable in practice to face
the proof obligation of $M_j \models \Box(\boxminus H_A \rightarrow H_G)$;
this is due to the inability of a module to predict the future of its environment.
Rule MOD-W makes this clearer.

5 Examples 

We consider two examples. The examples are very simple and are intended to 
contrast the respective strengths of strong and weak assumption-guarantee spec- 
ifications, demonstrating their complementary roles in compositional verification 
(rather than their abilities in tackling large systems). In each example, a system 
is decomposed as the parallel composition of two modules and a property of the 
system is proven compositionally. We argue that Rule MOD-S is more effective 
for the first example, while Rule MOD-W is more effective for the second. 

5.1 Example 1 

Consider again Program KEEP-AHEAD that appeared in Section 2. It is easy to
see that the values of $a$ and $b$ are monotonically (but not strictly) increasing
in KEEP-AHEAD, i.e., KEEP-AHEAD $\models \Box((a \geq a^-) \wedge (b \geq b^-))$.
Can the property be verified compositionally?

Theorem 1 suggests that we decompose $\Box((a \geq a^-) \wedge (b \geq b^-))$ as the
conjunction of $\Box(b \geq b^-) \triangleright \Box(a \geq a^-)$ and
$\Box(a \geq a^-) \triangleright \Box(b \geq b^-)$. Unfortunately, neither
$M_a \models \Box(b \geq b^-) \triangleright \Box(a \geq a^-)$ nor
$M_b \models \Box(a \geq a^-) \triangleright \Box(b \geq b^-)$. For Module $M_a$,
the assumption $\Box(b \geq b^-)$ says nothing about the initial value of $b$
(permitting $b$ to be an arbitrary negative integer initially); it therefore
cannot guarantee that the value of $a$ is monotonically increasing in the very
first step ($a$ is 0 initially and may become $b + 1$ in the first step). An
analogous argument applies to Module $M_b$.

A simple remedy is to first strengthen the proof obligation as KEEP-AHEAD
$\models \Box((\mathit{first} \rightarrow a \geq 0) \wedge (a \geq a^-) \wedge (\mathit{first} \rightarrow b \geq 0) \wedge (b \geq b^-))$.
Again, Theorem 1 suggests that we decompose

$\Box((\mathit{first} \rightarrow a \geq 0) \wedge (a \geq a^-) \wedge (\mathit{first} \rightarrow b \geq 0) \wedge (b \geq b^-))$

as the conjunction of

$\Box((\mathit{first} \rightarrow b \geq 0) \wedge b \geq b^-) \triangleright \Box((\mathit{first} \rightarrow a \geq 0) \wedge a \geq a^-)$

and

$\Box((\mathit{first} \rightarrow a \geq 0) \wedge a \geq a^-) \triangleright \Box((\mathit{first} \rightarrow b \geq 0) \wedge b \geq b^-)$.

It turns out that
$M_a \models \Box((\mathit{first} \rightarrow b \geq 0) \wedge b \geq b^-) \triangleright \Box((\mathit{first} \rightarrow a \geq 0) \wedge a \geq a^-)$
and
$M_b \models \Box((\mathit{first} \rightarrow a \geq 0) \wedge a \geq a^-) \triangleright \Box((\mathit{first} \rightarrow b \geq 0) \wedge b \geq b^-)$.
Applying Rule MOD-S, we successfully prove KEEP-AHEAD
$\models \Box((\mathit{first} \rightarrow a \geq 0) \wedge (a \geq a^-) \wedge (\mathit{first} \rightarrow b \geq 0) \wedge (b \geq b^-))$
and hence KEEP-AHEAD $\models \Box((a \geq a^-) \wedge (b \geq b^-))$ in a
compositional way.

It would be inconvenient, if not impossible, to achieve compositionality for
the example using Rule MOD-W. With this rule, one somehow has to first
establish either KEEP-AHEAD $\models \Box(a \geq a^-)$ or KEEP-AHEAD
$\models \Box(b \geq b^-)$. To establish KEEP-AHEAD $\models \Box(a \geq a^-)$
first, for example, one may attempt to prove KEEP-AHEAD $\models \Box(b \geq b^-)$
and $M_a \models \Box(b \geq b^-) \triangleright_w \Box(a \geq a^-)$. But
KEEP-AHEAD $\models \Box(b \geq b^-)$ then has to be established first, leading
to a cycle.

    local x, y : integer where x = y = 0

    loop forever do                     loop forever do
      l1 : await x < y + 1       ||       m1 : await y < x + 1
      l2 : x := x + 1                     m2 : y := y + 1

Fig. 3. Program KEEP-UP.

5.2 Example 2 

For the second example, we take Program KEEP-UP from [22, Chapter 4], which is
recreated in Figure 3. The program is decomposed as the composition of modules
$M_x$ and $M_y$ as shown in Figure 4. The system specifications of KEEP-UP, $M_x$,
and $M_y$ are omitted for brevity. Note that in $M_x$ the input variable $y$ is
explicitly given an initial value 0 and similarly in $M_y$ the input variable $x$
is given an initial value 0.

It can be shown that KEEP-UP $\models \Box(|x - y| \leq 1)$. In [22], the property
$\Box(|x - y| \leq 1)$ is proven compositionally by repeated applications of Rule
MOD-W. Note that $\Box(|x - y| \leq 1)$ is equivalent to
$\Box((x \leq y + 1) \wedge (y \leq x + 1))$, or
$\Box(x \leq y + 1) \wedge \Box(y \leq x + 1)$. Briefly, the compositional
verification proceeds as follows:
1. Prove KEEP-UP $\models \Box(x \geq x^-)$ by applying Rule MOD-W with $A$ replaced
by true and establishing the premise
$M_x \models \mathit{true} \triangleright_w \Box(x \geq x^-)$; prove KEEP-UP
$\models \Box(y \geq y^-)$ in an analogous way. These two proofs are independent
of each other.

2. Prove KEEP-UP $\models \Box(x \leq y + 1)$ by applying Rule MOD-W with $A$
replaced by $\Box(y \geq y^-)$, which was proven in the previous step, and
establishing the premise
$M_x \models \Box(y \geq y^-) \triangleright_w \Box(x \leq y + 1)$; prove KEEP-UP
$\models \Box(y \leq x + 1)$ in an analogous way. Again, these two proofs are
independent of each other.

    module M_x                              module M_y
      in y : integer where y = 0              in x : integer where x = 0
      own out x : integer where x = 0         own out y : integer where y = 0
      loop forever do                         loop forever do
        l1 : await x < y + 1                    m1 : await y < x + 1
        l2 : x := x + 1                         m2 : y := y + 1

Fig. 4. Program KEEP-UP as the parallel composition of two modules.

An attempt to use Rule MOD-S would fail. The property
$\Box(x \leq y + 1) \wedge \Box(y \leq x + 1)$ indeed follows from the conjunction
of $\Box(x \leq y + 1) \triangleright \Box(y \leq x + 1)$ and
$\Box(y \leq x + 1) \triangleright \Box(x \leq y + 1)$ as in the first example.
However, it is not possible to establish either
$M_y \models \Box(x \leq y + 1) \triangleright \Box(y \leq x + 1)$ or
$M_x \models \Box(y \leq x + 1) \triangleright \Box(x \leq y + 1)$. This is due to
the fact that both $(x \leq y + 1)$ and $(y \leq x + 1)$ are state formulas that may
be falsified by a transition of some environment compatible with $M_x$ (the
environment is allowed to change $y$ in an arbitrary way); a module cannot
possibly satisfy an assumption-guarantee property if the guarantee part can be
falsified by its environment. An analogous argument applies to $M_y$.
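
The system-level property can be explored directly for the closed program of
Figure 3. The sketch below is a bounded state-space exploration, not a proof; it
assumes that each `await` and each assignment is one atomic step and that the two
loops are interleaved arbitrarily. The encoding of states and the function names
are hypothetical.

```python
from collections import deque


def successors(state):
    """Yield the successor states of (pc_x, pc_y, x, y).

    A program counter of 1 means "at the await", 2 means "at the assignment".
    """
    pc_x, pc_y, x, y = state
    if pc_x == 1:
        if x < y + 1:                 # l1: await x < y + 1
            yield (2, pc_y, x, y)
    else:
        yield (1, pc_y, x + 1, y)     # l2: x := x + 1
    if pc_y == 1:
        if y < x + 1:                 # m1: await y < x + 1
            yield (pc_x, 2, x, y)
    else:
        yield (pc_x, 1, x, y + 1)     # m2: y := y + 1


def reachable(depth):
    """Return all states reachable within `depth` steps from x = y = 0."""
    start = (1, 1, 0, 0)
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        state, dist = frontier.popleft()
        if dist == depth:
            continue
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, dist + 1))
    return seen


def invariant_holds(depth=20):
    """Check |x - y| <= 1 on every state reachable within `depth` steps."""
    return all(abs(x - y) <= 1 for (_, _, x, y) in reachable(depth))
```

By contrast, an open module such as $M_x$ has no control over $y$: a single
environment step that sets $y$ to, say, $-2$ while $x = 0$ already falsifies
$x \leq y + 1$, which is the obstacle to Rule MOD-S described above.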

6 Liveness 

To allow liveness properties, we simply strengthen an assumption-guarantee 
specification by conjoining it with the ordinary implication between the as- 
sumption and the guarantee. We consider the extension of a strong assumption- 
guarantee formula; the extension of a weak one can be done analogously. We will 
present just one inference rule for composing such properties. 

As the generalized definition of a strong assumption-guarantee formula with
assumption $A = \Box H_A \wedge L_A$ (in the canonical form) and guarantee
$G = \Box H_G \wedge L_G$, we define $A \triangleright G$ as follows:

$A \triangleright G \triangleq (\Box H_A \triangleright \Box H_G) \wedge (A \rightarrow G)$

where the $\triangleright$ on the right hand side is as defined in Section 3.1. This
generalized definition is consistent with the definition of $\triangleright$ for
safety assumptions and guarantees, since if $A$ and $G$ are safety formulas, the
implication $A \rightarrow G$, i.e., $\Box H_A \rightarrow \Box H_G$, will be
subsumed by $\Box H_A \triangleright \Box H_G$.

Now that the assumptions may contain liveness properties, we no longer have
symmetric composition rules like that in Theorem 2, as mutual dependency
on liveness properties leads to unsound rules. Theorem 2 was generalized in
our previous work [16] to permit liveness in the guarantee parts. Below is an
asymmetric composition rule for two modules; its proof can be found in the full
paper [25].

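Theorem 6. Suppose that $A_1 = \Box H_{A_1}$, $G_1 = \Box H_{G_1} \wedge L_{G_1}$,
$A_2 = \Box H_{A_2} \wedge L_{A_2}$, $G_2 = \Box H_{G_2} \wedge L_{G_2}$,
$A = \Box H_A \wedge L_A$, and $G = \Box H_G \wedge L_G$, all in the canonical
form. Then,

1. (a) $\vdash \Box(\boxminus H_A \wedge \boxminus(H_{G_1} \wedge H_{G_2}) \rightarrow H_{A_1} \wedge H_{A_2})$
   (b) $\vdash A \wedge G_1 \rightarrow A_2$
2. (a) $\vdash \Box(\tilde{\ominus}\boxminus H_A \wedge \boxminus(H_{G_1} \wedge H_{G_2}) \rightarrow H_G)$
   (b) $\vdash A \wedge G_1 \wedge G_2 \rightarrow G$
----------------------------------------------------------------
$\vdash (A_1 \triangleright G_1) \wedge (A_2 \triangleright G_2) \rightarrow (A \triangleright G)$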

Again, take A to be true for a closed system. 

This composition rule looks unsophisticated and may seem not to be very 
useful. As a matter of fact, many practical systems exhibit the type of depen- 
dency treated by the rule. Take network protocols as an example. An upper-layer 
protocol relies on the liveness properties of a lower-layer one to ensure liveness in 
the service that it provides, but the lower-layer protocol does not assume liveness 
about the upper-layer one. We believe that two modules with more complicated 
dependency on liveness should be verified as one single module. 

7 Discussion: Guidelines of Usage 

We give a few guidelines for using the proposed compositional approach. 

– What type of systems can be treated with the compositional approach?

Our approach works for a system where each shared variable is owned and
can be modified by exactly one of its modules. This is partly due to the fact
that only this type of system allows parallel composition to be conveniently
modeled as conjunction in LTL. There certainly are ways to circumvent this
limitation; introducing the notion of agents is one possibility [1]. However,
we do not think that compositional verification should be applied to two
modules that may change the same shared variable. Sharing variables in such
a manner indicates that the two modules are tightly coupled and are best
treated as one single module.

– How does one decide which form of assumption-guarantee specification should
be used?

The desired property of a system gives a strong hint as to what the guarantee
parts should look like, as seen from the examples in Section 5. Here is the
first thing to check: does the guarantee part involve a variable owned by the
environment? For instance, in Example 1 the guarantee part in each of the
two assumption-guarantee properties does not involve a variable owned by
the environment. A module $M$ cannot possibly satisfy a property
$A \triangleright G$ if some environment compatible with $M$ is capable of
falsifying $G$, which is more likely to happen when $G$ involves a variable owned
by the environment. If the environment is capable of falsifying $G$, then one may
try to find a suitable $A'$ such that in falsifying $G$ the environment also has
to pay the price of falsifying $A'$. $A' \triangleright_w G$ could turn out to be a
property of $M$ useful for proving the desired property of the system.

– What changes are needed to the approach if one prefers using
“$+$”-superscripted (or primed) rather than “$-$”-superscripted variables in
expressing transitions of a system (as in TLA)?

We have opted for using “$-$”-superscripted variables, as it leads to a more
succinct formulation of assumption-guarantee specifications and rules for
composing such specifications. The required changes are quite straightforward.
If $A = \mathit{Init}_A \wedge \Box N_A$ and $G = \mathit{Init}_G \wedge \Box N_G$,
where $N_A$ and $N_G$ are transition formulas using primed variables, then
$A \triangleright G$ translates into

$\mathit{Init}_G \wedge (\mathit{Init}_A \rightarrow N_G) \wedge \Box(\boxminus((\mathit{first} \rightarrow \mathit{Init}_A) \wedge N_A) \rightarrow \bigcirc N_G)$.

Note that $\Box(\boxminus((\mathit{first} \rightarrow \mathit{Init}_A) \wedge N_A) \rightarrow \bigcirc N_G)$
is a safety formula, though not in the canonical form. The composition rules can
be changed accordingly.

We have omitted the treatment of hiding, i.e., assumptions and guarantees
with existentially quantified variables, to focus on showing the complementary
roles of strong and weak assumption-guarantee specifications. Hiding is a
powerful means of expressiveness. The formulation of strong assumption-guarantee
specifications with hiding has been considered in our previous work [16]; the
same technique applies to weak assumption-guarantee specifications.

Abstract. Refinement calculi for imperative programs provide an integrated
framework for programs and specifications and allow one to
develop programs from specifications in a systematic fashion. The seman- 
tics of these calculi has traditionally been defined in terms of predicate 
transformers and poses several challenges in defining a state transformer 
semantics in the denotational style. We define a novel semantics in terms 
of sets of state transformers and prove it to be isomorphic to positively 
multiplicative predicate transformers. This semantics disagrees with the 
traditional semantics in some places and the consequences of the dis- 
agreement are analyzed. 



1 Introduction 

Two dominant semantic views of imperative programs are in terms of state 
transformers, initiated by McCarthy m, Scott and Strachey m, and pred- 
icate transformers, initiated by Dijkstra m- State transformers give a clear 
correspondence with the operational semantics, where commands do, after all, 
transform the state of a machine. The predicate transformer view, on the other 
hand, has been argued to be suitable for showing that programs achieve certain 
goals, i.e., to questions of correctness. A definitive relationship between the two 
views was established by Plotkin [2Hj , following other work [SlEIlIl!, where it is 
shown that Dijkstra’s predicate transformers are isomorphic to nondeterminis- 
tic state transformers defined using the Smyth powerdomain. The isomorphism 
establishes a tight connection between the predicate transformer view and opera- 
tional behavior, which is not obvious otherwise. It is also of important conceptual 
value as it allows the two semantic views to coexist side by side. The ideas ex- 
pressed using either view can be converted into the other, and there is no conflict 
between the two views. 

In more recent work, predicate transformers have been put to new uses. Re- 
finement calculi, developed by Hehner m, Back Morris m, Morgan m 
and Nelson extend Dijkstra’s programming language with “specification 
statements.” Typically written as $[\varphi, \psi]$, a specification statement
stands for some statement that is yet to be developed but which is expected to
satisfy the specification $(\varphi, \psi)$, i.e., transform states satisfying
$\varphi$ to states satisfying $\psi$. Such specification statements serve as
space fillers in the initial stages of program development, and they are refined
to actual program statements in later stages.

J. Tiuryn (Ed.): FOSSACS 2000, LNCS 1784, pp. 359- fT7H 2000.

(c) Springer-Verlag Berlin Heidelberg 2000

The semantics of such extended languages for program refinement has only 
been defined in terms of predicate transformers. No semantics is known in terms 
of state transformers. Moreover, the predicate transformers involved in the se- 
mantics go beyond Dijkstra’s predicate transformers. (They do not satisfy Di- 
jkstra’s healthiness conditions such as continuity.) Since, by Plotkin’s result, 
state transformers are isomorphic to Dijkstra’s predicate transformers, we al- 
ready know that there are no conventional state transformers corresponding to 
these new predicate transformers. This leaves the operational interpretation of 
the refinement calculi very much in the dark. 

In this paper, we develop a semantic interpretation of refinement calculi in 
terms of state transformers. The basic idea is that statements in refinement 
calculi are to be interpreted as sets of state transformers that satisfy the speci- 
fications embedded in the statements. In Denney’s terminology m, this inter- 
pretation represents “under-determinism” as opposed to “nondeterminism.” We 
also need a notion of guarded state transformers, similar to the idea of partial 
functions, which are defined only for some subset of the set of all states. We are 
able to show that suitable sets of guarded state transformers are isomorphic to 
positively multiplicative predicate transformers. This parallels Plotkin’s original 
isomorphism result for Dijkstra’s predicate transformers. 

All the constructs of refinement calculi can be interpreted using sets of 
guarded state transformers. This gives a natural semantics of specification state- 
ments as collections of program statements that meet those specifications. How- 
ever, this semantics does not match up exactly with the traditional predicate 
transformer semantics of refinement calculi. The predicate transformers used in 
the latter are not in general positively multiplicative, a property used in our 
isomorphism result. 

We examine the consequences of this mismatch, and show that there are 
refinement laws that are intuitively unreasonable but hold in the traditional 
semantics though not in ours. The conclusion is that a better semantics of re- 
finement calculus is obtained by restricting to positively multiplicative predicate 
transformers which have a natural equivalence with state transformer sets. 

We believe these results go a long way towards demystifying refinement cal- 
culi. The absence of an operational reading for the constructs of refinement 
calculi has contributed to some of the mysteries surrounding the traditional 
treatment of the subject. The predicate transformer semantics implies that the 
theory of these calculi is internally consistent. However, the mysteries point to 
problems in interpreting the theory. Our contribution is in clarifying the in- 
terpretation which, we hope, might lead to a wider appreciation of the theory 
itself. 

Related Work The early work on relating state transformers and predicate trans- 
formers is mentioned in Plotkin |28] . In later work. Apt and Plotkin mm 
extended [28] to countable nondeterminism, and Smyth 122] to non-flat state
spaces. Bonsangue and Kok [7] found correspondences for safety and liveness
predicate transformers and, in [8], for Nelson’s predicate transformers [27]. We
should remark that all this work is for programming languages, not for specifica- 
tion languages used in refinement. However, there are close relationships between 
the results needed in this paper and the earlier results, especially those of Apt 
and Plotkin BE]. Morgan |^, Gardiner [2] and Naumann [25| also consid- 
ered multiplicative predicate transformers and the correspondence with relations 
(which may be seen as infinitely nondeterministic state transformers) . Gardiner 
et al. 12] and Naumann [2] used this correspondence to lift type structure to 
specification languages. 

After the present work was completed, we were made aware of Ewen Den- 
ney’s dissertation m, which echoes very similar ideas to our work. In particular, 
it interprets specifications via under-determinism. On the other hand, Denney 
focuses on functional programming languages whereas we are looking at im- 
perative programming and the correspondence between state transformer and 
predicate transformer interpretations. We also highlight the interaction between 
nondeterminism and under-determinism (cf. Sec. 13.21) . 

Overview. In Sec. 2 we give a brief summary of the refinement calculus we use in
this paper and define its predicate transformer semantics. Section 3 introduces
the state transformer concepts that are used in our semantics and shows their
isomorphism with positively multiplicative predicate transformers. The state
transformer semantics of the calculus is defined in Sec. m Finally, in Sec. El we 
discuss problems and issues that lie outside our isomorphism. 

2 Refinement Calculus

Refinement calculi are obtained by extending a programming language with
additional notations for expressing specifications. Program statements and
specification statements are then freely intermixed. A refinement relation
$\sqsubseteq$ is defined between statements in the extended language. A
collection of refinement laws, axiomatizing the refinement relation, is devised,
by which a specification can be
refined to an executable program in a series of steps. The subject is extensively 
covered in the two text books Bim as well as the collection [23| . 

Here, we use a variant of the Morgan-Gardiner refinement calculus [21 EH 
as the basis of our study. For simplicity, we treat basic imperative programs over 
a fixed collection of program variables. However, we will allow locally bound 
constant identifiers in specifications. 

Assume a finite set V of typed variable identifiers, and a countably infinite 
set 2 of constant identifiers, disjoint from V. Using these we form a collection of 
expressions, assertions and atomic commands, whose structure we leave unspecified
except to note that both variable identifiers and constant identifiers can occur
in them. The collection of statements in Dijkstra’s programming language is
given by the context-free syntax:

$C ::= A \mid \mathbf{skip} \mid \mathbf{abort} \mid C_1; C_2 \mid \mathbf{if}\ G\ \mathbf{fi} \mid \mathbf{do}\ G\ \mathbf{od}$
$G ::= \varepsilon \mid E \rightarrow C \mid G_1 \mathbin{[]} G_2$

where $A$ and $E$ range over atomic commands and boolean expressions
respectively, and $G$ stands for guarded commands.

To obtain a refinement calculus, we extend the collection of statements by
two clauses:

$C ::= \ldots \mid v_1, \ldots, v_n\colon [\varphi, \psi] \mid \mathbf{con}\ i\colon \tau = E\ \mathbf{in}\ C$

The statement $v_1, \ldots, v_n\colon [\varphi, \psi]$ is called a specification
statement or a prescription. The intended meaning is that it stands for some
arbitrary program statement that satisfies the specification $(\varphi, \psi)$,
i.e., transforms states satisfying $\varphi$ to those satisfying $\psi$, by
modifying at most the variables $v_1, \ldots, v_n$. The variables
$v_1, \ldots, v_n$ are said to constitute the frame of the statement. When the
frame includes all the variables in $V$, we use the abbreviation
$[\varphi, \psi]$ for $V\colon [\varphi, \psi]$.
For example, the statement r: [n > 0, — 1 < n < r^] specifies the action of 

assigning to r the integer square root of n. 

The construct $\mathbf{con}\ i\colon \tau = E\ \mathbf{in}\ C$ specifies an action
that satisfies the specification $C$ when $i$ is given the value of $E$ in the
current state. For example,

$\mathbf{con}\ k\colon \mathrm{int} = |n|\ \mathbf{in}\ n\colon [\mathit{true},\ |n| = k + 1]$

specifies that $n$ must be modified so as to increase its absolute value by 1. This
is a variant of the constant-introduction construct of Morgan and Gardiner m 
where we require the initial value to be explicitly declared. We consider the 
Morgan-Gardiner construct in Section El as it raises interesting semantic issues. 

Predicate Transformer Semantics 

A predicate transformer interpretation for the refinement calculus has been de- 
fined by Morgan and Gardiner [l9l[l2 (as well as other authors on the subject). 
Here, we use a semantic version of this interpretation by taking predicates as 
sets of states. Our treatment closely follows Plotkin [28]. See also [8] [7] for
similar presentations.

Let $\Sigma$ be the set of states for the variables in $V$. For technical reasons,
we assume that $\Sigma$ is countable. A predicate is a subset $a \subseteq \Sigma$.
A predicate transformer is a monotone function
$t : \mathcal{P}(\Sigma) \rightarrow \mathcal{P}(\Sigma)$. Predicate transformers
are partially ordered by the pointwise ordering:
$t_1 \sqsubseteq t_2 \iff \forall a' \in \mathcal{P}(\Sigma).\ t_1(a') \subseteq t_2(a')$.
The poset of predicate transformers is denoted PT.

A predicate transformer is said to be completely multiplicative if for any
family $F' \subseteq \mathcal{P}(\Sigma)$, $t(\bigcap F') = \bigcap \{ t(a') \mid a' \in F' \}$.
We call it positively multiplicative if this property holds for all nonempty
families $F' \subseteq \mathcal{P}(\Sigma)$. Define the poset:

$\mathrm{PTM}^+ = \{ t : \mathcal{P}(\Sigma) \rightarrow \mathcal{P}(\Sigma) \mid t \text{ is positively multiplicative} \}$, ordered pointwise

If $t$ is any predicate transformer and $x \in t(\Sigma)$, define

$L_t(x) = \bigcap \{ a' \mid x \in t(a') \}$.
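
These definitions can be explored on a small finite state space. The sketch below
is illustrative only and makes several assumptions that are not in the paper:
$\Sigma$ is the three-element set `SIGMA`, predicates are `frozenset`s, and the two
example transformers `wp_inc` and `angelic` are made up for the demonstration. On
a finite lattice, preserving binary intersections is enough for positive
multiplicativity.

```python
from itertools import chain, combinations

SIGMA = frozenset({0, 1, 2})


def powerset(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]


PREDS = powerset(SIGMA)


def is_positively_multiplicative(t):
    """Check t(a & b) == t(a) & t(b) for all pairs of predicates."""
    return all(t(a & b) == t(a) & t(b) for a in PREDS for b in PREDS)


def least_post(t, x):
    """L_t(x): the intersection of all a' with x in t(a')."""
    result = SIGMA
    for a in PREDS:
        if x in t(a):
            result &= a
    return result


def wp_inc(a):
    """Weakest precondition of the assignment x := (x + 1) mod 3."""
    return frozenset(x for x in SIGMA if (x + 1) % 3 in a)


def angelic(a):
    """A monotone transformer that only needs 0 or 1 to be in the postcondition."""
    return SIGMA if (0 in a or 1 in a) else frozenset()
```

Here `wp_inc` is positively multiplicative and `least_post(wp_inc, 0)` is `{1}`,
while `angelic` is not: `angelic({0})` and `angelic({1})` both equal `SIGMA`, but
`angelic` of their intersection is empty.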

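^ We often use primed variable names (such as $a'$) as arguments for predicate
transformers to denote the fact that they are sets of post-states.

Program Operations

  Skip  : $\mathcal{D}$
  Conv  : $\text{d-ST} \rightarrow \mathcal{D}$
  Comp  : $\mathcal{D}^2 \rightarrow \mathcal{D}$
  Cond  : $\mathrm{Bool} \times \mathcal{D} \rightarrow \mathcal{D}$
  Do    : $\mathrm{Bool} \times \mathcal{D} \rightarrow \mathcal{D}$
  Empty : $\mathrm{Bool} \times \mathcal{D}$
  Guard : $\mathrm{Bool} \times \mathcal{D} \rightarrow \mathrm{Bool} \times \mathcal{D}$
  Bar   : $(\mathrm{Bool} \times \mathcal{D})^2 \rightarrow \mathrm{Bool} \times \mathcal{D}$

Specification Operations

  $\mathrm{Pres}_R$ : $\mathcal{P}(\Sigma)^2 \rightarrow \mathcal{D}$
  $\mathrm{Con}_A$  : $(\Sigma \rightarrow A) \times (A \rightarrow \mathcal{D}) \rightarrow \mathcal{D}$

where $R \subseteq \Sigma \times \Sigma$ is an equivalence relation
      $A$ is a countable set
      $\text{d-ST} = \Sigma \rightarrow \Sigma$, ordered discretely
      $\mathrm{Bool} = \Sigma \rightarrow \{\mathit{tt}, \mathit{ff}\}$, ordered discretely

Table 1. Signature of the Semantic Algebra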



Lemma 1. A predicate transformer $t$ is positively multiplicative iff, for all
$x \in t(\Sigma)$, $x \in t(L_t(x))$. In this case, $L_t(x)$ is the least $a'$ such
that $x \in t(a')$.

Lemma 2. For any predicate transformer $t$, there is a least positively
multiplicative predicate transformer $t^*$ above $t$, given by
$t^*(a') = \{ x \in t(\Sigma) \mid L_t(x) \subseteq a' \}$.

Note that $L_{t^*}(x)$ and $L_t(x)$ are the same. By forcing $t^*(L_{t^*}(x))$ to
include $x$, we obtain a positively multiplicative predicate transformer. We call
$t^*$ the positively multiplicative closure of $t$.
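
The closure can be computed directly from the formula in Lemma 2 when the state
space is finite. The following sketch assumes a three-element $\Sigma$ and a
made-up transformer `angelic` that is not positively multiplicative; all names
are hypothetical.

```python
from itertools import chain, combinations

SIGMA = frozenset({0, 1, 2})
PREDS = [frozenset(c) for c in chain.from_iterable(
    combinations(sorted(SIGMA), r) for r in range(len(SIGMA) + 1))]


def least_post(t, x):
    """L_t(x): the intersection of all a' with x in t(a')."""
    result = SIGMA
    for a in PREDS:
        if x in t(a):
            result &= a
    return result


def closure(t):
    """Positively multiplicative closure: t*(a') = {x in t(SIGMA) | L_t(x) <= a'}."""
    top = t(SIGMA)

    def t_star(a):
        return frozenset(x for x in top if least_post(t, x) <= a)

    return t_star


def angelic(a):
    """Monotone, but does not preserve the intersection of {0} and {1}."""
    return SIGMA if (0 in a or 1 in a) else frozenset()


angelic_star = closure(angelic)
```

For this example $L_t(x)$ is empty for every $x$, so `angelic_star` maps every
predicate, including the empty one, to `SIGMA`. It lies above `angelic` pointwise
and preserves nonempty intersections, as the lemma states.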

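Lemma 3. $\mathrm{PTM}^+$ is a complete lattice with least upper bounds given by
$\bigsqcup_i t_i = (\lambda a'.\ \bigcup_i t_i(a'))^*$. The least element
$\bot_{\mathrm{PTM}^+}$ is $\lambda a'.\ \emptyset$.

Note that $\mathrm{PTM}^+$ is not a complete sublattice of PT because the least
upper bounds in $\mathrm{PTM}^+$ are different from those in PT.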

We work in the category of complete lattices with monotone functions as
morphisms. By Tarski’s fixed point theorem, every monotone function
$f : L \rightarrow L$ has a least fixed point, given by
$\mathrm{fix}(f) = \bigsqcap \{ t \in L \mid f(t) \sqsubseteq t \}$.
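
For a finite powerset lattice the formula can be evaluated literally and compared
with iteration from the bottom element. The sketch below is only an illustration;
the universe and the monotone function `f` are made up for the example.

```python
from itertools import chain, combinations

UNIVERSE = frozenset(range(4))


def powerset(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]


def f(s):
    """A monotone function: always include 0, and x + 2 for every x in s."""
    return frozenset({0}) | frozenset(x + 2 for x in s if x + 2 in UNIVERSE)


def fix_by_tarski(func):
    """Intersect all pre-fixed points t, i.e. those with func(t) <= t."""
    result = UNIVERSE
    for t in powerset(UNIVERSE):
        if func(t) <= t:
            result &= t
    return result


def fix_by_iteration(func):
    """Iterate func from the empty set until it stabilises."""
    current = frozenset()
    while func(current) != current:
        current = func(current)
    return current
```

Both functions return `{0, 2}` for `f`. The iterative version terminates here only
because the lattice is finite; the formula in the text does not depend on that.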

To define the semantics of the refinement calculus, we use an algebraic
approach as in [28]. Table 1 shows the signature of a semantic algebra
$\mathcal{D}$, where all the operations are meant to be monotone maps. For the
predicate transformer semantics $\mathcal{D} = \mathrm{PTM}^+$, but the same
signature will be used for the state transformer semantics to be introduced
later. The program operations are as in [28] and we recall their definitions in
Table 2. The only difference from [28] is that we are using positively
multiplicative predicate transformers instead of continuous ones.

  Skip = $\lambda a'.\ a'$
  Conv($m$) = $\lambda a'.\ m^{-1}(a')$
  Comp($t_1, t_2$) = $t_1 \circ t_2$
  Cond($p, t$) = $\lambda a'.\ p^+ \cap t(a')$
  Do($p, t$) = $\mathrm{fix}_{\mathrm{PTM}^+}(\lambda t'.\ \lambda a'.\ (p^- \cap a') \cup (p^+ \cap (t \circ t')(a')))$
  Empty = $(\lambda x.\ \mathit{ff},\ \lambda a'.\ \emptyset)$
  Guard($p, t$) = $(p, t)$
  Bar($(p_1, t_1), (p_2, t_2)$) = $(p_1 \vee p_2,\ \lambda a'.\ (p_1^+ \cup p_2^+) \cap (p_1^- \cup t_1(a')) \cap (p_2^- \cup t_2(a')))$

where $p^+ = p^{-1}(\mathit{tt})$, $p^- = p^{-1}(\mathit{ff})$ and $(p \vee q)(x) = p(x) \vee q(x)$

Table 2. Program Operations for $\mathrm{PTM}^+$
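
Some of these operations can be tried out on a finite state space. The sketch
below is an illustration under stated assumptions rather than the paper's
construction: $\Sigma$ is `{0, 1, 2, 3}`, predicates are `frozenset`s, Boolean
functions are Python predicates, and `do` computes the fixed point pointwise by
iteration from the empty set, which suffices on a finite lattice. The function
names mirror Table 2 but are otherwise hypothetical.

```python
SIGMA = frozenset(range(4))


def skip():
    """Skip: the identity predicate transformer."""
    return lambda a: frozenset(a)


def conv(m):
    """Conv(m): inverse image of a' under the state transformer m."""
    return lambda a: frozenset(x for x in SIGMA if m(x) in a)


def comp(t1, t2):
    """Comp(t1, t2): functional composition t1 after t2."""
    return lambda a: t1(t2(a))


def cond(p, t):
    """Cond(p, t): states where the guard p holds and t establishes a'."""
    return lambda a: frozenset(x for x in t(a) if p(x))


def do(p, t):
    """Do(p, t): least fixed point of the loop equation, computed per a'."""
    def loop(a):
        current = frozenset()
        while True:
            exits = frozenset(x for x in a if not p(x))
            steps = frozenset(x for x in t(current) if p(x))
            nxt = exits | steps
            if nxt == current:
                return current
            current = nxt
    return loop


# Example: do x < 3 -> x := x + 1 od, with the assignment capped at 3.
guard = lambda x: x < 3
incr = conv(lambda x: min(x + 1, 3))
count_up = do(guard, incr)
```

With these definitions `count_up(frozenset({3}))` is all of `SIGMA`: the loop
establishes $x = 3$ from every initial state, whereas
`count_up(frozenset({0}))` is empty.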



For interpreting specification constructs, we define two new operators:

1. Prescription: The operation $\mathrm{Pres}_R$ captures the semantics of Morgan’s
specification statement $v\colon [\varphi, \psi]$. The idea that only variables $v$
can be modified can be represented by an equivalence relation
$R \subseteq \Sigma \times \Sigma$, which equates states that possibly differ only
in variables $v$. We write $[x]_R$ for the equivalence class of $x$ under $R$.
Define a family of operations indexed by equivalence relations
$R \subseteq \Sigma \times \Sigma$:

$\mathrm{Pres}_R : \mathcal{P}(\Sigma) \times \mathcal{P}(\Sigma) \rightarrow \mathrm{PTM}^+$

$\mathrm{Pres}_R(b, b') = \lambda a'.\ \{ x \in b \mid b' \cap [x]_R \subseteq a' \}$

2. Constant introduction: The family of operations
$\mathrm{Con}_A : (\Sigma \rightarrow A) \times (A \rightarrow \mathrm{PTM}^+) \rightarrow \mathrm{PTM}^+$
captures the introduction of constant identifiers of type $A$.

$\mathrm{Con}_A(e, f) = \lambda a'.\ \{ x \in \Sigma \mid x \in f(e(x))(a') \}$

We note that two of Dijkstra’s healthiness conditions are violated by these
predicate transformers. When $R$ relates all pairs of states,

– $\mathrm{Pres}_R(b, \emptyset)$ is not strict (unless $b = \emptyset$).

– $\mathrm{Pres}_R(b, \Sigma)$ is not continuous. Note that $\Sigma$ can be expressed
as a lub $\bigcup_i a'_i$ for an increasing sequence of finite sets $a'_i$.
$\mathrm{Pres}_R(b, \Sigma)(\Sigma) = b$, but for every finite $a'$,
$\mathrm{Pres}_R(b, \Sigma)(a') = \emptyset$.
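
The failure of strictness can be reproduced on a finite state space (the failure
of continuity cannot, since it needs an infinite $\Sigma$). The sketch below is
illustrative: states are pairs `(v, w)` over a three-element value set, the frame
is the single variable `v`, and all names are hypothetical.

```python
from itertools import product

VALUES = range(3)
SIGMA = frozenset(product(VALUES, VALUES))  # a state is a pair (v, w)


def frame_v(x, y):
    """R(v): two states are related iff they agree outside the frame, on w."""
    return x[1] == y[1]


def total(x, y):
    """The relation that relates all pairs of states."""
    return True


def pres(rel, pre, post):
    """Pres_R(b, b') = lambda a'. {x in b | b' & [x]_R is a subset of a'}."""
    def t(a):
        return frozenset(
            x for x in pre
            if all(y in a for y in post if rel(x, y)))
    return t


# v: [true, v = w]: set v to the current value of w.
diagonal = frozenset(s for s in SIGMA if s[0] == s[1])
copy_w = pres(frame_v, SIGMA, diagonal)

# An unsatisfiable postcondition with a nonempty precondition b.
b = frozenset({(0, 0), (1, 2)})
miracle = pres(total, b, frozenset())
```

Here `miracle(frozenset())` equals `b`, so the transformer is not strict, while
`copy_w` maps the predicate $v \geq w$ to all of `SIGMA` and the empty predicate
to the empty set.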



Lemma 4. All the operators above are well-defined and monotone. 

Note that the operators are not necessarily continuous. For example, Comp is 
not continuous. 

The semantics of the refinement language is as follows. Since commands 
have free identifiers (for constants), we use environments for giving values to the 
identifiers [13 ng. Env denotes the set of environments. The semantic functions 
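are defined in Table 3, parameterized by a semantic algebra $\mathcal{D}$. By instantiating
the definition by $\mathcal{D} = PTM^{+}$, we obtain the predicate transformer semantics.

\[
\begin{array}{lcl}
\mathcal{P} &:& \text{Predicates} \to Env \to \mathcal{P}(\Sigma) \\
\mathcal{E} &:& \text{Expressions} \to Env \to (\Sigma \to Value) \\
\mathcal{A} &:& \text{Atomic commands} \to Env \to \text{d-ST} \\
\mathcal{C} &:& \text{Statements} \to Env \to \mathcal{D} \\
\mathcal{G} &:& \text{Guarded Commands} \to Env \to Bool \times \mathcal{D}
\end{array}
\]

Interpretation of Commands

\[
\begin{array}{lcl}
\mathcal{C}[\![A]\!]e &=& \mathrm{Conv}(\mathcal{A}[\![A]\!]e) \\
\mathcal{C}[\![\mathbf{skip}]\!]e &=& \mathrm{Skip} \\
\mathcal{C}[\![\mathbf{abort}]\!]e &=& \bot \\
\mathcal{C}[\![C_1; C_2]\!]e &=& \mathrm{Comp}(\mathcal{C}[\![C_1]\!]e,\ \mathcal{C}[\![C_2]\!]e) \\
\mathcal{C}[\![\mathbf{if}\ G\ \mathbf{fi}]\!]e &=& \mathrm{Cond}(\mathcal{G}[\![G]\!]e) \\
\mathcal{C}[\![\mathbf{do}\ G\ \mathbf{od}]\!]e &=& \mathrm{Do}(\mathcal{G}[\![G]\!]e) \\
\mathcal{C}[\![v\colon[\varphi, \psi]]\!]e &=& \mathrm{Pres}_{R(v)}(\mathcal{P}[\![\varphi]\!]e,\ \mathcal{P}[\![\psi]\!]e) \\
\mathcal{C}[\![\mathbf{con}\ i\colon\tau = E\ \mathbf{in}\ C]\!]e &=& \mathrm{Con}_{[\![\tau]\!]}(\mathcal{E}[\![E]\!]e,\ \lambda k \in [\![\tau]\!].\ \mathcal{C}[\![C]\!]e[i \mapsto k])
\end{array}
\]

where $R(v)$ denotes the equivalence relation on states given by
$x\ [R(v)]\ x' \iff \forall u \notin v.\ x(u) = x'(u)$

Interpretation of Guarded Commands

\[
\begin{array}{lcl}
\mathcal{G}[\![\varepsilon]\!]e &=& \mathrm{Empty} \\
\mathcal{G}[\![E \to C]\!]e &=& \mathrm{Guard}(\mathcal{E}[\![E]\!]e,\ \mathcal{C}[\![C]\!]e) \\
\mathcal{G}[\![G_1\ []\ G_2]\!]e &=& \mathrm{Bar}(\mathcal{G}[\![G_1]\!]e,\ \mathcal{G}[\![G_2]\!]e)
\end{array}
\]

Table 3. Semantics of Refinement Calculus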



We denote these semantic functions by $\mathcal{C}_M$ and $\mathcal{G}_M$ for commands and guarded
commands respectively.

The fact that all the semantic algebra operations are monotonic implies that
program contexts preserve refinement, i.e., $C \sqsubseteq C'$ implies $P\{C\} \sqsubseteq P\{C'\}$
for any program context $P\{\ \}$. This result is essential for program refinement
because it allows one to refine whole programs by refining their components one
at a time.



3 State Transformers and Predicate Transformers 

Consider the set of states $\Sigma$. The set obtained by adding an element $\bot$ (for the
undefined state) is denoted $\Sigma_\bot$. We make $\Sigma_\bot$ into a poset by defining the partial
order $x \sqsubseteq y \iff x = \bot \vee x = y$. The Smyth powerdomain of $\Sigma_\bot$ is defined as
follows:

$\mathcal{P}_S(\Sigma_\bot)$ = the set of nonempty finite subsets of $\Sigma$ and the infinite set $\Sigma_\bot$,
ordered by superset order.

So, the least element of $\mathcal{P}_S(\Sigma_\bot)$ is $\Sigma_\bot$.

The domain of state transformers is

\[ ST = (\Sigma \to \mathcal{P}_S(\Sigma_\bot)), \text{ ordered pointwise} \]

The intuition is as follows. If $c \sqsubseteq c'$, then

1. $c'$ terminates (possibly) more often than $c$, and

2. $c'$ is (possibly) more deterministic than $c$.

We say that $c'$ is “better” than $c$. Say that a state transformer $c$ satisfies a
specification $(a, a')$, written $c \models (a, a')$, if running $c$ from a state in $a$ gives a
state in $a'$. Formally,

\[ c \models (a, a') \iff \forall x \in a.\ c(x) \subseteq a' \]

Then, it is easy to see that $c \models (a, a') \wedge c \sqsubseteq c' \implies c' \models (a, a')$. That is, better
state transformers continue to satisfy all the old specifications.

By regarding a predicate transformer $t$ as a collection of specifications
$\{(t(a'), a')\}_{a' \in \mathcal{P}(\Sigma)}$, we have a notion of satisfaction for predicate transformers:

\[ c \models t \iff \forall x.\ \forall a'.\ x \in t(a') \Rightarrow c(x) \subseteq a' \]

The strongest predicate transformer satisfied by $c$ is denoted $T_c$:

\[ T_c(a') = \{x \in \Sigma \mid c(x) \subseteq a'\} \]

$T_c$ is nothing but the “weakest precondition” operator of $c$. It satisfies the
following properties:

- continuity: $T_c(\bigcup_i a'_i) = \bigcup_i T_c(a'_i)$ for every ascending chain $\{a'_i\}_i$. The
reason is that $T_c(\bigcup_i a'_i)$ includes all and only those initial states $x$ whose
results $c(x)$ are included in finite subsets of $\bigcup_i a'_i$.

- positive multiplicativity: $T_c(\bigcap_{i \in X} a'_i) = \bigcap_{i \in X} T_c(a'_i)$ for nonempty $X$.
The reason is that $x$ is in $\bigcap_{i \in X} T_c(a'_i)$ only when, for all $i$, $c(x)$ is a subset of
$a'_i$. This is equivalent to $x \in T_c(\bigcap_{i \in X} a'_i)$.

- strictness: $T_c(\emptyset) = \emptyset$. The reason is that $c(x)$ is always nonempty. So,
$c(x) \subseteq \emptyset$ is impossible.
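
On a finite state space, positive multiplicativity and strictness can be checked exhaustively. The sketch below is a hedged illustration: the three-element state space, the marker `BOT` for the undefined state and the sample transformer `c` are all assumptions made for this example. Continuity is not checked, because it only becomes a real constraint on infinite chains.

```python
from itertools import combinations

BOT = "bot"                       # stand-in for the undefined state
SIGMA = frozenset({0, 1, 2})      # hypothetical finite state space
SIGMA_BOT = SIGMA | {BOT}


def powerset(states):
    """All subsets of a finite set, as frozensets."""
    items = sorted(states)
    return [frozenset(s) for r in range(len(items) + 1)
            for s in combinations(items, r)]


def wp(c):
    """T_c(a') = {x in SIGMA | c(x) is a subset of a'}."""
    return lambda a_post: frozenset(x for x in SIGMA if c[x] <= a_post)


# A sample state transformer: deterministic at 0, nondeterministic at 1,
# possibly nonterminating at 2 (its result is the whole of SIGMA_BOT).
c = {0: frozenset({1}), 1: frozenset({1, 2}), 2: SIGMA_BOT}
t = wp(c)

subsets = powerset(SIGMA)
strict = t(frozenset()) == frozenset()
multiplicative = all(t(a & b) == (t(a) & t(b))
                     for a in subsets for b in subsets)
monotone = all(t(a) <= t(b) for a in subsets for b in subsets if a <= b)
```

State 2 never belongs to `t(a)` because its result contains `BOT`, which no postcondition over `SIGMA` can cover.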

It is possible to recover $c$ from $T_c$. For any predicate transformer $t$ that satisfies
these properties, consider the map $\lambda x.\ x \in t(\Sigma) \to L_t(x); \Sigma_\bot$. It can be verified
that this map is a state transformer.

Theorem 1 (Plotkin). There is an order-isomorphism between ST and the 
poset of predicate transformers that are continuous, positively multiplicative and 
strict. 

Recall that the predicate transformers used in refinement calculus do not gener- 
ally satisfy the properties mentioned above. We examine a series of state trans- 
former concepts that correspond to wider classes of predicate transformers. 

We use the notation $p \to x; y$ to mean “if $p$ then $x$ else $y$.”



3.1 Guarded State Transformers

The idea of a guarded state transformer is similar to that of a partial function. 
A guarded state transformer is meant to be run only starting from certain initial 
states and not from others. Formally, a guarded state transformer is a pair 

\[ (p \subseteq \Sigma,\ c : p \to \mathcal{P}_S(\Sigma_\bot)) \]

Note that $c$ is only defined for states in $p$ (which is called the “domain of
definition”) and undefined for others. This notion of “undefined” is different from
nontermination. (The state transformer $c$ might still map states in $p$ to $\Sigma_\bot$.) A
guarded state transformer is simply never meant to be used outside its domain
of definition. The notion of satisfaction is:

\[ (p, c) \models (a, a') \iff \forall x \in p.\ x \in a \Rightarrow c(x) \subseteq a' \]

So, we only worry about initial states within the domain of definition. As a
result, the completely undefined state transformer satisfies every specification. In
particular, $(\emptyset, \lambda x.\ \Sigma_\bot) \models (\Sigma, \emptyset)$. Recall that there are no ordinary state
transformers satisfying $(\Sigma, \emptyset)$. But this is not the case for guarded state transformers.
In the refinement calculus literature, this (sneaky!) way of satisfying specifications
is termed “miraculous”.
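
The satisfaction relation for guarded state transformers is short enough to write down directly. In this minimal sketch, a guarded state transformer is modelled as a pair of a domain and a dictionary defined on that domain; the state space and the sample transformers are hypothetical.

```python
SIGMA = frozenset({0, 1, 2})      # hypothetical finite state space


def satisfies(guarded, spec):
    """(p, c) |= (a, a') iff c(x) is a subset of a' for every x in p and a."""
    p, c = guarded
    a, a_post = spec
    return all(c[x] <= a_post for x in p if x in a)


# A total transformer that behaves like skip on every state.
skip = (SIGMA, {x: frozenset({x}) for x in SIGMA})

# The completely undefined transformer: empty domain of definition.
undefined = (frozenset(), {})

# A transformer defined only on state 0, which it maps to state 1.
partial = (frozenset({0}), {0: frozenset({1})})

impossible = (SIGMA, frozenset())        # the specification (SIGMA, {})

skip_ok = satisfies(skip, impossible)             # False
miracle_ok = satisfies(undefined, impossible)     # True, vacuously
partial_ok = satisfies(partial, (SIGMA, frozenset({1})))   # True
```

The undefined transformer satisfies the unsatisfiable-looking specification only because the quantifier ranges over an empty domain, which is the "miracle" described above.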

We define a partial order on guarded state transformers by

\[ (p, c) \sqsubseteq (p', c') \iff p \supseteq p' \wedge (\forall x \in p'.\ c(x) \supseteq c'(x)). \]

This partial order may seem surprising. We get a better state transformer by
reducing the domain of definition. However, this order is consistent with the
notion of satisfaction:

\[ (p, c) \sqsubseteq (p', c') \wedge (p, c) \models (a, a') \implies (p', c') \models (a, a') \]

Just as partial functions $A \rightharpoonup B$ can be regarded as total functions of type
$A \to B_\bot$ with an adjoined $\bot$ element denoting the undefined result, guarded
state transformers can be regarded as state transformers with an adjoined top
element in the codomain: $\Sigma \to \mathcal{P}_S^{\top}(\Sigma_\bot)$. Here $\mathcal{P}_S^{\top}(\Sigma_\bot)$ is like the Smyth
powerdomain but also includes the empty set $\emptyset$ (which serves as the top element
under the superset order). A guarded state transformer $(p, c)$ is represented
as the function $\lambda x \in \Sigma.\ x \in p \to c(x); \emptyset$. Conversely, a
state transformer $d : \Sigma \to \mathcal{P}_S^{\top}(\Sigma_\bot)$ represents the guarded state transformer
$(\mathrm{dom}(d),\ d \restriction \mathrm{dom}(d))$ where $\mathrm{dom}(d) = \Sigma \setminus d^{-1}(\emptyset)$. From here on, we will identify
guarded state transformers with this alternative representation, which is
technically convenient to work with.

Define GST as the poset:

\[ GST = \Sigma \to \mathcal{P}_S^{\top}(\Sigma_\bot), \text{ ordered pointwise} \]

For every guarded state transformer $d \in GST$, we define a predicate transformer
$T_d : \mathcal{P}(\Sigma) \to \mathcal{P}(\Sigma)$ by

\[ T_d(a') = \{x \in \Sigma \mid d(x) \subseteq a'\} \]

This predicate transformer is continuous and positively multiplicative for the
same reasons as before. But it is not strict. We have
$T_d(\emptyset) = \{x \in \Sigma \mid d(x) = \emptyset\} = \Sigma \setminus \mathrm{dom}(d)$, which has no reason to be empty.
There is an inverse to $T$:

\[ T^{-1}(t) = \lambda x.\ x \in t(\Sigma) \to L_t(x); \Sigma_\bot \]



Theorem 2. There is an order-isomorphism between GST and the poset of pred- 
icate transformers that are continuous and positively multiplicative. 

3.2 State Transformer Sets 

Given a specification $(a, a')$, we have a collection $S$ of state transformers
satisfying it. Any such collection is closed under union in the following sense: if
$c$ is a state transformer and, for every $x \in \Sigma$, there are $c_1, \ldots, c_n \in S$ such
that $c(x) \subseteq c_1(x) \cup \ldots \cup c_n(x)$, then $c \in S$. There is a simpler statement of
this. Let the “lower bound” map $\underline{S} : \Sigma \to \mathcal{P}(\Sigma_\bot)$ be the pointwise union
$\underline{S}(x) = \bigcup \{c(x) \mid c \in S\}$ (which is not a state transformer). Closure under
union says that any state transformer $c$ such that $c(x) \subseteq \underline{S}(x)$ is in $S$. The same
idea can also be used for guarded state transformers. In this case, the collection
$S$ must be nonempty. If $S$ is a nonempty set of guarded state transformers, we
define its closure under union by $S^{\dagger} = \{c \mid \forall x.\ c(x) \subseteq \underline{S}(x)\}$.

Remark 1. The lower bound maps S can be regarded as maps of type E 
V^{E±) where is the infinitely nondeterministic Smyth powerdomain [^, 
whose elements include E±_ and all subsets of E. Sets of state transformers closed 
under union are one-to-one with such infinitely nondeterministic maps. This is 
in fact an order-isomorphism. 

Let PGST denote the poset with nonempty sets of guarded state transformers 
that are closed under union, ordered by superset order. We call the elements of 
PGST state transformer sets. 

Lemma 5. PGST is a complete lattice with the least upper bounds given by
intersection: $\bigsqcup_i S_i = \bigcap_i S_i$.

For any $S \in PGST$, we define a predicate transformer $T_S : \mathcal{P}(\Sigma) \to \mathcal{P}(\Sigma)$
by

\[ T_S(a') = \bigcap_{c \in S} T_c(a') = \{x \in \Sigma \mid \forall c \in S.\ c(x) \subseteq a'\} \]

This predicate transformer is positively multiplicative:
$T_S(\bigcap_{i \in X} a_i) = \bigcap_{i \in X} T_S(a_i)$ for nonempty $X$. But it is not continuous. We have
$T_S(\bigcup_i a_i) = \{x \in \Sigma \mid \forall c \in S.\ c(x) \subseteq \bigcup_i a_i\}$. If $S$ is the set of all
terminating state transformers, then $T_S(\Sigma) = \Sigma$, but $T_S(a) = \emptyset$ for every finite $a \subseteq \Sigma$.

Conversely, every positively multiplicative predicate transformer corresponds
to a state transformer set: $T^{-1}(t) = \{c \mid c \models t\}$.

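Theorem 3. There is an order-isomorphism between PGST and $PTM^{+}$.

\[
\begin{array}{lcl}
\mathrm{Skip} &=& \{\lambda x.\ \{x\}\}^{\dagger} \\
\mathrm{Conv}(m) &=& \{\lambda x.\ \{m(x)\}\}^{\dagger} \\
\mathrm{Comp}(S_1, S_2) &=& \{\lambda x.\ \mathrm{App}(c_2, c_1(x)) \mid c_1 \in S_1,\ c_2 \in S_2\}^{\dagger} \\
\mathrm{Cond}(p, S) &=& \{\lambda x.\ p(x) \to c(x); \Sigma_\bot \mid c \in S\}^{\dagger} \\
\mathrm{Do}(p, S) &=& \mathrm{fix}_{PGST}\ \lambda S'.\ \{\lambda x.\ p(x) \to \mathrm{App}(c', c(x)); \{x\} \mid c \in S,\ c' \in S'\}^{\dagger} \\
\mathrm{Empty} &=& (\lambda x.\ \mathit{ff},\ \{\lambda x.\ \ldots \\
\mathrm{Guard}(p, S) &=& (p, S) \\
\mathrm{Bar}((p_1, S_1), (p_2, S_2)) &=& (p_1 \vee p_2,\ \{\lambda x.\ p_1(x) \to (p_2(x) \to c_1(x) \cup c_2(x); c_1(x)); (p_2(x) \to c_2(x); \Sigma_\bot) \\
 & & \quad \mid c_1 \in S_1,\ c_2 \in S_2\}^{\dagger})
\end{array}
\]

where $\mathrm{App} : GST \times \mathcal{P}_S^{\top}(\Sigma_\bot) \to \mathcal{P}_S^{\top}(\Sigma_\bot)$ is defined by

\[ \mathrm{App}(c, a') = (a' = \Sigma_\bot) \to \Sigma_\bot; \bigcup_{x' \in a'} c(x') \]

Table 4. Program Operations for PGST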



4 State Transformer Semantics 

We define a semantics of the refinement calculus using state transformer sets 
introduced in the previous section. We proceed as in Sec. 2 by defining a semantic
algebra over PGST. The operations for program statements are lifted versions
of Plotkin’s operations in [2B]. They are shown in Table 4. The operations for
specification statements are as follows:

1. Prescription: For any equivalence relation $R \subseteq \Sigma \times \Sigma$,

\[ \mathrm{Pres}_R : \mathcal{P}(\Sigma) \times \mathcal{P}(\Sigma) \to PGST \]
\[ \mathrm{Pres}_R(b, b') = \{c \in GST \mid \forall x \in b.\ c(x) \subseteq b' \cap [x]_R\} \]

This defines our under-determinism semantics for specification statements.
A specification statement stands for an arbitrary command that satisfies the
specification.

2. Constant introduction:

\[ \mathrm{Con}_A : (\Sigma \to A) \times (A \to PGST) \to PGST \]
\[ \mathrm{Con}_A(e, f) = \{c \in GST \mid \forall x \in \Sigma.\ \exists c' \in f(e(x)).\ c(x) = c'(x)\} \]

This looks a bit intricate, but it is easier to see in terms of lower bound
maps: $\underline{\mathrm{Con}_A(e, f)}(x) = \underline{f(e(x))}(x)$.

Lemma 6. The order-isomorphism $T : PGST \cong PTM^{+}$ is an isomorphism of
the semantic algebra.

The semantic equations in Table 3 now give a state transformer semantics for
the refinement calculus. We denote these semantic functions by $\mathcal{C}_S$ and $\mathcal{G}_S$.

Theorem 4. The isomorphism $T : PGST \cong PTM^{+}$ is an isomorphism of the
semantics of refinement calculus in the sense that the following diagrams
commute:

\[ \mathcal{C}_M = T \circ \mathcal{C}_S : \text{Statements} \times Env \to PTM^{+} \]

\[ \mathcal{G}_M = (\mathrm{id} \times T) \circ \mathcal{G}_S : \text{Guarded Commands} \times Env \to Bool \times PTM^{+} \]



5 Beyond the Isomorphism 

In the last section, we focused on giving a state transformer semantics to a refine- 
ment calculus in such a way that it matches the traditional predicate transformer 
semantics. The benefit of this exercise is that it gives an intuitive support for 
the traditional approach. However, we believe this semantics is not ideal. The 
state transformer set approach gives us a better handle on specifications which 
does not seem possible in the predicate transformer approach. In this section, 
we explore the new opportunities. 

Consider the semantics of a do statement of the form $\mathbf{do}\ B \to v\colon[\varphi, \psi]\ \mathbf{od}$.
The intent is that the specification $v\colon[\varphi, \psi]$ will eventually be refined to a
concrete program statement which will then be repeated during execution. If there
are several possible refinements, one of them must be chosen before the execu- 
tion ever begins. In contrast, the predicate transformer semantics as well as our 
matching state transformer semantics allow the statement of the loop body to 
be chosen each time the loop is repeated. In other words, they represent non- 
determinism instead of under-determinism. To arrive at a better semantics, we 
redefine the Do operator as follows: 

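\[ \mathrm{Do} : Bool \times PGST \to PGST \]

\[ \mathrm{Do}(p, S) = \{\mathrm{Do}_{GST}(p, c) \mid c \in S\}^{\dagger} \]

\[ \mathrm{Do}_{GST}(p, c) = \mathrm{fix}_{GST}\ \lambda d.\ \lambda x.\ p(x) \to \mathrm{App}(d, c(x)); \{x\} \]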

In this under-determinism semantics, a fixed command is chosen for the loop 
body which is then repeated during execution. It does not seem possible to 
express such an interpretation in the predicate transformer setting. 

Morgan’s refinement calculus contains a general constant-introduction
operator of the form $\mathbf{con}\ i\colon\tau.\ C(i)$, where there is no initialization of the constant
identifier. This operator is termed “conjunction,” and its meaning is explained
as the worst program that is better than every $C(i)$. In other words, it is the
least upper bound of all $C(i)$’s. Formally, the interpretation is

\[ \mathcal{C}[\![\mathbf{con}\ i\colon\tau.\ C]\!]e = \bigsqcup_{k \in [\![\tau]\!]} \mathcal{C}[\![C]\!]e[i \mapsto k] \]







Since, in PGST, least upper bounds are given by intersections, we obtain 

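\[ \mathcal{C}_S[\![\mathbf{con}\ i\colon\tau.\ C]\!]e = \bigcap_{k \in [\![\tau]\!]} \mathcal{C}_S[\![C]\!]e[i \mapsto k] \]

which says that a state transformer satisfying $\mathbf{con}\ i\colon\tau.\ C(i)$ must satisfy $C(i)$
for every value of $i$. The semantics in $PTM^{+}$ amounts to:

\[ \mathcal{C}_M[\![\mathbf{con}\ i\colon\tau.\ C]\!]e = (\lambda a'.\ \bigcup_{k \in [\![\tau]\!]} \mathcal{C}_M[\![C]\!]e[i \mapsto k](a'))^{*} \]

Given that PGST and $PTM^{+}$ are order-isomorphic, these two interpretations
match up in the sense of Theorem 4.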

However, the traditional semantics [14] is given in PT where all monotone 
predicate transformers are present and least upper bounds are given pointwise. 
So, the interpretation of con amounts to 

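\[ \mathcal{C}_P[\![\mathbf{con}\ i\colon\tau.\ C]\!]e = \lambda a'.\ \bigcup_{k \in [\![\tau]\!]} \mathcal{C}_P[\![C]\!]e[i \mapsto k](a') \]

where the subscript $P$ identifies the semantics in PT. This predicate transformer
is not positively multiplicative even if every $\mathcal{C}_P[\![C]\!]e[i \mapsto k]$ is positively
multiplicative.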

What are the consequences of this mismatch? Since positively multiplicative
predicate transformers form a proper subset of predicate transformers, our
semantics identifies statements which would be semantically distinct in the
traditional semantics. The following is an example. For convenience, we use a binary
conjunction operator $C_1 \wedge C_2$, which can be regarded as a special case of the
general one, for example as $(\mathbf{con}\ i\colon bool.\ \mathbf{if}\ i \to C_1\ []\ \neg i \to C_2\ \mathbf{fi})$. Consider the
two statements:

\[ C = [true,\ n \geq 0] \wedge [true,\ n \leq 0] \quad \text{and} \quad C' = [true,\ n = 0] \]

The collection of state transformers satisfying the two specifications is exactly the
same. It is $\{\lambda x.\ \{0\}\}^{\dagger}$. (We are taking states to be the values of the variable $n$.)
Hence, $C = C' = (n := 0)$ in our semantics. However, the traditional semantics
interprets the two statements as the respective predicate transformers

\[ t(a') = ((a' \supseteq \mathbb{Z}^{+} \cup \{0\}) \vee (a' \supseteq \mathbb{Z}^{-} \cup \{0\})) \to \Sigma; \emptyset \]

\[ t'(a') = (a' \supseteq \{0\}) \to \Sigma; \emptyset \]

which are clearly distinct. Whereas $t'$ is equivalent to $n := 0$, $t$ is not equivalent
to any program statement. Nevertheless, $n := 0$ is the only nontrivial statement
that $C$ can be refined to. These distinctions have nontrivial consequences under
sequential composition. Consider

\[ D = C; [n = 0,\ n = 9] \quad \text{and} \quad D' = C'; [n = 0,\ n = 9]. \]

The traditional semantics equates D to abort, whereas D' is equivalent to n := 
9. The equivalence D = abort is surprising. We are hard put to find any intuitive 
explanation of why D should be equivalent to abort.

To pin down the difference between the traditional semantics and ours, we
consider the following (hypothetical) $\wedge$-distributivity law:

\[ (C_1 \wedge C_2); S \sqsubseteq (C_1; S) \wedge (C_2; S) \]

To us, this law seems unreasonable. Basically, it says that the requirements for
a composite command $(C_1; S) \wedge (C_2; S)$ entail requirements for the component
commands $(C_1 \wedge C_2)$. However, the law is validated by Morgan’s semantics and
the fact $D \sqsubseteq \mathbf{abort}$ can be derived using it. This law is not valid in our semantics.
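
The example with $D$ and $D'$ can be replayed mechanically in the traditional (pointwise) semantics. The sketch below makes several simplifying assumptions that are not in the paper: the integers are cut down to the finite window `range(-2, 10)`, predicate transformers are plain Python functions on frozensets, and conjunction is the pointwise union of the two transformers. The names `spec`, `join_pointwise` and `compose` are hypothetical helpers.

```python
from itertools import combinations

SIGMA = frozenset(range(-2, 10))     # finite window standing in for the integers
NONNEG = frozenset(n for n in SIGMA if n >= 0)
NONPOS = frozenset(n for n in SIGMA if n <= 0)
EMPTY = frozenset()


def powerset(states):
    """All subsets of a finite set, as frozensets."""
    items = sorted(states)
    return [frozenset(s) for r in range(len(items) + 1)
            for s in combinations(items, r)]


def spec(pre, post):
    """Pres_R(pre, post) when R relates all states: pre if post <= a' else {}."""
    return lambda a: pre if post <= a else EMPTY


def join_pointwise(t1, t2):
    """Least upper bound in PT, taken pointwise."""
    return lambda a: t1(a) | t2(a)


def compose(t1, t2):
    """Sequential composition C1; C2 as t1 after t2 on postconditions."""
    return lambda a: t1(t2(a))


t = join_pointwise(spec(SIGMA, NONNEG), spec(SIGMA, NONPOS))   # C
t_prime = spec(SIGMA, frozenset({0}))                          # C'
s = spec(frozenset({0}), frozenset({9}))                       # [n = 0, n = 9]

d = compose(t, s)              # D
d_prime = compose(t_prime, s)  # D'

posts = powerset(SIGMA)
d_is_abort = all(d(a) == EMPTY for a in posts)
d_prime_is_assign = all(d_prime(a) == (SIGMA if 9 in a else EMPTY)
                        for a in posts)
t_not_multiplicative = t(NONNEG & NONPOS) != (t(NONNEG) & t(NONPOS))
```

In this model `d` maps every postcondition to the empty set, which is the weakest precondition of abort, while `d_prime` behaves like `n := 9`. The last check also confirms that `t` fails positive multiplicativity on the pair `NONNEG`, `NONPOS`.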

6 Conclusion 

Refinement calculi have been proposed as integrated frameworks for combining 
programs and specifications and as vehicles for deriving programs from speci- 
fications. But their traditional semantics, defined in the predicate transformer 
setting, leaves several questions unanswered. The most important of these is 
what specification statements mean in terms of one’s operational intuitions. By 
giving a semantics in terms of sets of state transformers, we hope to have an- 
swered these questions. We showed that the mysterious concept of “miracle” has 
a natural explanation in terms of partially defined state transformers. We also 
proposed that the non-multiplicative predicate transformers used in the tradi- 
tional semantics may not be ideal, whereas a semantics based on positively mul- 
tiplicative predicate transformers has a natural correspondence with the state 
transformer semantics. 

We leave open the question of what it means for a semantics to be ideal. 
For programming languages, the ideal semantics is often taken to be a fully 
abstract semantics, i.e., one whose equality relation is the same as observational 
equivalence. For specification languages, it is not yet clear what observational 
equivalence might mean. 

We have considered a very simple language here to focus on the main ideas. 
The extension of the ideas to cover procedures, abstract data types and
object-oriented concepts remains to be addressed.

Abstract. The ambient calculus was designed to model mobile processes
and study their properties. A first type system was proposed by
Cardelli, Gordon and Ghelli to prevent run-time faults. We extend it by
introducing subtyping and present a type-checking algorithm which returns
a minimal type relative to this system. Along the way, we also add two
new constructs to the language. Finally, we remove the type annotations
from the syntax and give a type-inference algorithm for the original type
system.


1 Introduction 

With the growing development of the World-Wide Web, it becomes interesting
and fruitful to investigate the problems and properties of mobile code. The
ambient calculus was designed to model within a single framework both mobile
computing, that is to say computation in mobile devices like a laptop, and
mobile computation, that is to say mobile code moving between different devices,
like applets or agents. It also shows how the notions of administrative domains,
their crossing, firewalls, authorizations, etc. can be formalized in a calculus. In this
sense, it is more appropriate than the π-calculus ([Mil91]), even if the bases
are the same (for more discussion about the problems raised by mobility and
computation over wide-area networks, see [Car99a], [Car99b]).

Informally, an ambient is a bounded place, with an inside and an outside, 
where computation happens. Many ambients can be nested so that they form a 
hierarchy. Each of them has a name (not necessarily distinct from other ambient 
names), which will be used to control access. An ambient can be moved as a 
whole with all the computations and subambients it contains: it can enter another 
ambient or exit it. It can also be opened so that its contents become visible at the
current level. For more intuitions motivating the ambient calculus or its graphical
vision (the folder calculus), we recommend reading [CG98], [Car99a].

In order to prove some specific properties concerning mobility, locking,
communication, etc. in the ambient calculus, Cardelli and Gordon proposed a simple
type system, nondeterministic and without subtyping (see [CG99], [CGG99]):
some simple valid processes like $\langle 1 \rangle \mid (x : Real).P$ were not typable. The aim of
our work was to introduce a subtyping relation, deduce a typing algorithm and,
in doing this, to make the type system more suitable for a treatment of mobile
ambients as a “programming language for mobility” (departing thus from the
“ambients as a specific language for reasoning on mobility” approach).

J. Tiuryn (Ed.): FOSSACS 2000, LNCS 1784, pp. 375- fHgUl 2000.

(c) Springer- Verlag Berlin Heidelberg 2000

The rest of the paper is organized as follows. In Section 2 we briefly review
the ambient calculus and its semantics, adding two new constructs
to the original syntax along the way. Then, in Section 3 we present our extension of
the type system, introduce a subtyping ordering, and define a new typing
system. In Section 4 we show that the set of derivable types for a term has a
minimum, and we give an algorithm to compute it efficiently. It is then possible
to type-check a term of the ambient calculus automatically. The next step, in
Section 5, is to remove all type annotations from the term and try to find a
type-inference algorithm. We give a complete solution for the original type system
without subtyping.

This paper is a shortened version of an internship report [Zim99], which
contains more explanations, further developments and all the proofs of the theorems
stated in this paper.

2 Mobile Ambients: Syntax and Semantics 

In this section, we briefly review the polyadic ambient calculus that we
use throughout this paper. We explain its main constructs and
rules, but we recommend reading [CG98] (or any other paper presenting the
calculus extensively) for a more complete presentation. Along the way, we
also extend the original calculus with two new constructs.

The polyadic ambient calculus is mainly composed of processes. As in many
other process calculi, we have an inactive process 0 which does nothing, we
can compose processes in parallel ($P \mid Q$), we have a construct to replicate a
process as many times as necessary ($!P$) and we have a restriction operator
($(\nu n)P$) which introduces a new name $n$ and restricts its scope to the inside
of $P$. In the π-calculus, those names represent channels; here they represent
ambient names. In our calculus, we also declare the type of this ambient name
($(\nu n : Amb^{Y}[T, T'])P$), but we will not care about that until the next section.

An ambient is composed of a name n and a process P which is running
inside the ambient. We write it n[P]. Here we extend the syntax of the polyadic
ambient calculus with a new construction. Up to now, the locking or unlocking of
an ambient was defined only in the declaration of its name, so all ambients with
the same name had the same locking annotation. We kept this possibility
(extending it with ordering), but we changed the syntax of ambient
constructions: n[P] is an unlocked ambient and n|P] is a locked one. So we can now have
an explicit construct which guarantees that an ambient will be locked. This may
seem redundant with the locking annotation in the type declaration of n, but
from a programming point of view, it is more flexible.

The process M.P executes the action induced by the capability M and then
continues with the process P. There are three kinds of capabilities: one to enter
another ambient (in n), one to exit (out n) and one to open up an ambient
(open n). To build such a capability, a process must know the name n. It can
also have received the capability via communication (see below). Implicitly, it is
impossible to reconstruct the ambient name n out of one or all of these
capabilities (this is important to prove the security of a firewall for example). (In the
original calculus, we could also compose capabilities into paths (M.M') where ε
was the empty path; we do not use these constructs in this paper, but they are
not difficult to handle; see [Zim99] for details.)

The use of these three capabilities is given by the following reduction rules:

$n[\mathit{in}\ m.P \mid Q] \mid m[R] \to m[n[P \mid Q] \mid R]$    (Red In)

$m[n[\mathit{out}\ m.P \mid Q] \mid R] \to n[P \mid Q] \mid m[R]$    (Red Out)

$\mathit{open}\ n.P \mid n[Q] \to P \mid Q$    (Red Open)

with the convention that in (Red In) and (Red Out), each occurrence of [.] can
be replaced by |.] (the ambients can be locked or unlocked), whereas in (Red
Open), the ambient n must be unlocked.
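
To make the side condition on locking concrete, here is a small Python sketch of (Red In) and (Red Open) applied at the top level of a parallel composition. It is an illustration under stated assumptions, not the calculus itself: the classes `Amb` and `Act`, the function `step`, and the encoding of a parallel composition as a tuple are all hypothetical, and (Red Out), replication, restriction and communication are left out.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Amb:
    """An ambient n[P]; locked=True models the locked form."""
    name: str
    body: Tuple = ()
    locked: bool = False


@dataclass(frozen=True)
class Act:
    """A capability prefix M.P with M one of: in n, open n."""
    cap: str
    target: str
    cont: Tuple = ()


def step(procs):
    """Apply one (Red Open) or (Red In) step at top level, or return None."""
    for i, p in enumerate(procs):
        # (Red Open): open n.P | n[Q] -> P | Q, only if n is unlocked.
        if isinstance(p, Act) and p.cap == "open":
            for j, q in enumerate(procs):
                if isinstance(q, Amb) and q.name == p.target and not q.locked:
                    rest = tuple(r for k, r in enumerate(procs)
                                 if k not in (i, j))
                    return rest + p.cont + q.body
        # (Red In): n[in m.P | Q] | m[R] -> m[n[P | Q] | R], locked or not.
        if isinstance(p, Amb):
            for a_idx, a in enumerate(p.body):
                if isinstance(a, Act) and a.cap == "in":
                    for j, q in enumerate(procs):
                        if j != i and isinstance(q, Amb) and q.name == a.target:
                            inner = p.body[:a_idx] + p.body[a_idx + 1:] + a.cont
                            moved = Amb(p.name, inner, p.locked)
                            host = Amb(q.name, q.body + (moved,), q.locked)
                            rest = tuple(r for k, r in enumerate(procs)
                                         if k not in (i, j))
                            return rest + (host,)
    return None
```

For instance, `step((Act("open", "n"), Amb("n", (), True)))` returns `None`, because the only candidate ambient is locked and no other rule applies.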

Here we add our second extension: the imm capability. A process containing
it must be immobile. We added it because, for immobility too, we want a
language construct which obliges a process to be immobile, instead of delegating this
to the types (another reason is that, without it, no construct would introduce
the mobility annotation Z in our type system). The corresponding reduction rule

$\mathit{imm}.P \to P$    (Red Imm)

is very simple and actually “ignores” the imm capability. In fact, imm is only
useful when typing: if imm.P is typable, then P cannot contain moving
capabilities (even by receiving them). At run-time, only this guarantee is important
and we can throw imm away.

Finally, we have two communication primitives: $(n_1 : W_1, \ldots, n_k : W_k).P$
and $\langle M_1, \ldots, M_k \rangle$. The first waits for a tuple of values of respective types
$W_1, \ldots, W_k$, and binds them to the variables $n_1, \ldots, n_k$ in the continuing process
P. The second outputs a tuple of values. Note that the output is asynchronous
(no continuing process). The corresponding reduction rule is:

$(n_1 : W_1, \ldots, n_k : W_k).P \mid \langle M_1, \ldots, M_k \rangle \to P\{n_1 \leftarrow M_1, \ldots, n_k \leftarrow M_k\}$    (Red Comm)

The five last rules just say that reduction (i.e. computation) can also occur
beyond scope restrictions, inside ambients, or by using the structural congruence
rewriting (which we will not detail here):

$P \to Q \implies (\nu n : W)P \to (\nu n : W)Q$    (Red Res)
$P \to Q \implies n[P] \to n[Q]$    (Red Amb∘)
$P \to Q \implies n|P] \to n|Q]$    (Red Amb•)
$P \to Q \implies P \mid R \to Q \mid R$    (Red Par)
$P' \equiv P,\ P \to Q,\ Q \equiv Q' \implies P' \to Q'$    (Red ≡)



Here is the complete syntax of our calculus. Note that it allows strange
expressions like (in n)[P] or out (out n).P. Rejecting those nonsense terms will
be an automatic property of the type system.

P, Q ::=                                      processes
    (νn : Amb^Y[T, T'])P                      restriction
    0                                         inactivity
    P | Q                                     composition
    !P                                        replication
    M[P]                                      unlocked ambient
    M|P]                                      locked ambient
    M.P                                       action
    (n_1 : W_1, ..., n_k : W_k).P             input
    ⟨M_1, ..., M_k⟩                           async output

M ::=                                         expressions
    n                                         name
    in M                                      can enter into M
    out M                                     can exit out of M
    open M                                    can open M
    imm                                       immobility
Terms are also identified up to the consistent renaming of bound variables, 
in the restriction and input constructs. Thus, we can always suppose that all the 
ambient names and input variables are distinct. 

3 A Type System with Subtyping 

In order to verify some properties of processes in the ambient calculus, Cardelli
and Gordon proposed a first type system in [CG99] and extended it with Ghelli
in [CGG99]. It ensured that a well-typed process could not cause certain kinds
of run-time fault: exchanging values of the wrong type, or moving or opening an
ambient when this was not allowed. We will always refer to this type system as
“CGG”.

In this section, we describe a new type system, extending (in some way)
CGG. We first introduce some new types and define an ordering relation on
them (subtyping is essential to be able to write a typing algorithm). Then, we
give new typing rules and show some properties.

3.1 Type Definitions 

We start by giving all the definitions of our types: 



Z ::= 


mobility annotations 


Y :■= 


locking annotations 


4 


mobility unknown 


-L 


locking bottom 


y 


immobile 


• 


locked 


rv 


mobile 


o 


unlocked 




mobile and immobile 


r 


locking top 


0,1 ■■= 


input/output types 






± 


bottom value 


T ::= 


process type 


Wi 


X • • • X Wk (fc > 0) tuple 




— * I 


T 


top value 






W ::= 


message types 







Amb^Y[T, T'] with T ≤ T' (see below)    ambient name
Cap[T]                                  capability

3.2 Intuitive Meanings and Ordering

The main intuition in defining the order is to always respect the subsumption
rule: if P has type T and T ≤ T', then P also has type T'. There are a few
changes in the syntax of types compared to those of CGG. They were motivated
by the introduction of subtyping: what seems “intuitive” is not always correct
(this problem appeared clearly in the referee reports for a draft of this paper:
whereas one referee found “may be mobile ≤ immobile, may be opened ≤ locked”
as the “implicit” relation in CGG, another one stated that “immobile ≤ mobile
and locked ≤ unlocked” was the “obvious subtyping of CGG”).

Mobility Annotations What is the “obvious” subtyping of CGG? It depends
on the point of view. If we consider that an immobile process can generate
movements, we should define $\veebar \leq \curvearrowright$. For example, everybody would say that the
process 0 is immobile, but in order to type in n.0, we should also be able to say
that 0 is mobile. On the other hand, with this definition, if we restrict a process
to stay immobile, it can always remove this restriction with the subsumption
rule, which speaks more in favour of $\curvearrowright \leq \veebar$.

The subtyping relation we need depends on the property we privilege:
generation of movement or restriction of immobile processes. If we want to keep both
results, we have to introduce a new symbol $\bot^{\curvearrowright}$, and keep $\curvearrowright$ and $\veebar$ incomparable.
In order to have a complete lattice, we also introduce $\top^{\curvearrowright}$ (which will also be useful
in the typing algorithm and for the ambient types), and we define $\bot^{\curvearrowright} \leq \veebar, \curvearrowright \leq \top^{\curvearrowright}$.
Since this structure is a complete lattice (4 points with a lozenge-like ordering),
meet and join operations are always defined on it.
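
The four-point lozenge is small enough to spell out in full. The following Python sketch uses descriptive string names for the four annotations instead of the paper's symbols; these names, and the functions `leq`, `meet` and `join`, are assumptions made for illustration only.

```python
BOTTOM, IMMOBILE, MOBILE, TOP = "bottom", "immobile", "mobile", "top"
ANNOTATIONS = (BOTTOM, IMMOBILE, MOBILE, TOP)


def leq(a, b):
    """Lozenge order: bottom below everything, top above everything,
    immobile and mobile incomparable."""
    return a == b or a == BOTTOM or b == TOP


def meet(a, b):
    """Greatest lower bound; incomparable elements meet at bottom."""
    if leq(a, b):
        return a
    if leq(b, a):
        return b
    return BOTTOM


def join(a, b):
    """Least upper bound; incomparable elements join at top."""
    if leq(a, b):
        return b
    if leq(b, a):
        return a
    return TOP
```

With this encoding, `meet(IMMOBILE, MOBILE)` is `BOTTOM` and `join(IMMOBILE, MOBILE)` is `TOP`, while neither `leq(IMMOBILE, MOBILE)` nor `leq(MOBILE, IMMOBILE)` holds.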



Process Type In the type system of CGG, there was only one single term
representing the type of values exchanged in the ambient (Shh or a tuple).
In presence of subtyping, we should now accept that outputs and inputs have 
different types. For example, the output of the integer 1 should be accepted 
by an input variable of type Real. So we decided to track the types of output 
and input values exchanged in the ambient. If a process is valid, it must then 
have type I with O < I to ensure that any output can be read by 

any input instruction (note that this is not specified in the syntax of T, but 
will be a property of the type system; see also below). Later, it appeared that 
Yoshida and Hennessy used the same approach for a higher-order 7r-calculus with 
subtyping ( [YH99] ). Moreover, a valid process should not contain both mobile 
and immobile instructions; consequently the mobility annotation is forbidden 
for processes. 

Definition 1 (Validity). A process type I is said to be valid if Z 

and O < I. 



For the process type, we define : 



Z < Z' 

o<o' 

!>!' 



I <^'o' 1' 
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I A^'O' I' = O AO' IV I' 

IW O' I' = ^'^^'oyO''^lAl' 

The set of all process types has a complete lattice structure, with -*^_L T 
as the minimal element and _L as the maximal one. But, if we consider 

only valid processes, there are many maximal types: all O and -O O 

for every input/output type O. 



Input/Output Types Then, we have to define what output and input
types are. As before, such a type can be a tuple. But we had to replace Shh by two different
values, one for outputs ($Shh_{out} = \bot$) and one for inputs ($Shh_{in} = \top$). It then
appeared useful to consider that the meaning of these values was different for
the input and output terms:

Value   Output term (O)                        Input term (I)
⊥       No output                              There can be an
        Dumb process (Shh_out)                 input of any type
⊤       There can be an                        No input
        output of any type                     Deaf process (Shh_in)

For example, if there are two outputs of different arities in parallel, the
resulting process has type $\top$ for O. With the condition that $O \leq I$, the process
is valid if and only if $I = \top$, i.e. if there is no input instruction (you can say
anything only if nobody is listening). A similar argument holds by exchanging
inputs and outputs. This is a different vision from [YH99], where $\top$ was forbidden
as an output type and $\bot$ as an input type.

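For the input/output types, we define the partial order:

  $\bot \le W_1 \times \cdots \times W_k \le \top \qquad \forall k,\ \forall W_i$

  $W_1 \times \cdots \times W_k \le W_1' \times \cdots \times W_k' \iff W_i \le W_i' \quad \forall\, 1 \le i \le k$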



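This definition induces a complete lattice structure on input/output types,
so that the meet and join operations are always defined (the obvious cases with
$\bot$ and $\top$ are omitted for simplicity):

  $W_1 \times \cdots \times W_k \;\wedge\; W_1' \times \cdots \times W_{k'}' =
  \begin{cases}
    (W_1 \wedge W_1') \times \cdots \times (W_k \wedge W_k') & \text{if } k = k' \text{ and } W_i \wedge W_i' \text{ is defined } \forall\, 1 \le i \le k \\
    \bot & \text{otherwise}
  \end{cases}$

  $W_1 \times \cdots \times W_k \;\vee\; W_1' \times \cdots \times W_{k'}' =
  \begin{cases}
    (W_1 \vee W_1') \times \cdots \times (W_k \vee W_k') & \text{if } k = k' \text{ and } W_i \vee W_i' \text{ is defined } \forall\, 1 \le i \le k \\
    \top & \text{otherwise}
  \end{cases}$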
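The order, meet and join on input/output types can be sketched in Python as
follows. This is only an illustrative sketch, not part of the type system
itself: the encoding (the strings "bot" and "top", tuples of message types) and
the base message types Int, Real and Bool with Int below Real are assumptions
made for the example, and all function names are hypothetical.

```python
BOT = "bot"   # least I/O type: Shh_out (no output) / an input of any type
TOP = "top"   # greatest I/O type: Shh_in (no input) / an output of any type

def msg_leq(a, b):
    """Order on the assumed base message types: reflexive, plus Int <= Real."""
    return a == b or (a, b) == ("Int", "Real")

def msg_meet(a, b):
    """Greatest lower bound of two message types, or None if undefined."""
    if msg_leq(a, b):
        return a
    if msg_leq(b, a):
        return b
    return None

def msg_join(a, b):
    """Least upper bound of two message types, or None if undefined."""
    if msg_leq(a, b):
        return b
    if msg_leq(b, a):
        return a
    return None

def io_leq(x, y):
    """bot <= every tuple <= top; tuples are compared componentwise."""
    if x == BOT or y == TOP:
        return True
    if x == TOP or y == BOT:
        return False
    return len(x) == len(y) and all(msg_leq(a, b) for a, b in zip(x, y))

def io_meet(x, y):
    """Meet: componentwise when arities agree, bot otherwise."""
    if x == TOP:
        return y
    if y == TOP:
        return x
    if x == BOT or y == BOT or len(x) != len(y):
        return BOT
    parts = [msg_meet(a, b) for a, b in zip(x, y)]
    return BOT if None in parts else tuple(parts)

def io_join(x, y):
    """Join: componentwise when arities agree, top otherwise."""
    if x == BOT:
        return y
    if y == BOT:
        return x
    if x == TOP or y == TOP or len(x) != len(y):
        return TOP
    parts = [msg_join(a, b) for a, b in zip(x, y)]
    return TOP if None in parts else tuple(parts)
```

With this encoding, two outputs of different arities in parallel, for instance
`io_join(("Int",), ("Int", "Int"))`, give the top element, and the validity
condition `io_leq(O, I)` then holds only when the input type is also top.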



Message Types Finally, we have a type for messages that can be exchanged
in an ambient. It can be either an ambient name or a capability. It is safe to add
here other common types like Int, Bool...

A capability $\mathit{Cap}[T]$ simply contains the effects T which will be
released if the capability is executed. The ordering is natural:

  $\mathit{Cap}[T] \le \mathit{Cap}[T'] \iff T \le T'$

  $\mathit{Cap}[T] \wedge \mathit{Cap}[T'] = \mathit{Cap}[T \wedge T']$

  $\mathit{Cap}[T] \vee \mathit{Cap}[T'] = \mathit{Cap}[T \vee T']$

An ambient name is composed of a locking annotation and two types
representing the processes running inside the ambient. Concerning the locking
annotations, we can repeat the same discussion as for mobility annotations. Thus
we add two new symbols $\bot^{\circ}$ and $\top^{\circ}$, and define
$\bot^{\circ} \le \bullet, \circ \le \top^{\circ}$, with $\bullet$ and $\circ$
incomparable.

With $\bot^{\circ}$ all three constructions n[P], n|P] and open n are allowed.
With $\circ$ only n[P] and open n are allowed, and with $\bullet$ only n|P] is
allowed. With $\top^{\circ}$ none of them is allowed. However, note that all
other constructions (like in n or $\langle n \rangle$) are valid with any
locking annotation for n.

Concerning the process running inside an ambient, it seems that only one
process type would be enough (it was in CGG). With subtyping, we would like
to say that a process is allowed to run inside an ambient of type
$\mathit{Amb}^{Y}[T]$ if and only if it has a type $T' \le T$ (so that 0 is
always accepted). T represents the maximal effects allowed in the ambient.

Now, what is the natural ordering for ambient names? Suppose
$\mathit{Amb}^{Y}[T] \le \mathit{Amb}^{Y}[T']$ when $T \le T'$. Then, if n has
type $\mathit{Amb}^{Y}[T]$, it has type $\mathit{Amb}^{Y}[T']$ by the
subsumption rule for any $T' \ge T$. Thus, n[P] is typable for any process P
of arbitrary type T', which is contrary to our requirements.

On the other hand, suppose $\mathit{Amb}^{Y}[T] \le \mathit{Amb}^{Y}[T']$ when
$T' \le T$. Then, if n has type $\mathit{Amb}^{Y}[T]$, it has type
$\mathit{Amb}^{Y}[T_{min}]$ where $T_{min}$ is the minimal process type, so
open n has type $\mathit{Cap}[T_{min}]$, which is contrary to our intuition: n
can contain processes with "stronger" effects.

This explains why we need two process types in the type of an ambient
name, with two different orderings. In $\mathit{Amb}^{Y}[T,T']$, T represents
the maximal type allowed for processes inside the ambient (i.e. all valid
processes also have type T) (cf. rules (Proc Amb$\circ$) and (Proc
Amb$\bullet$) in Section 3.3), whereas T' represents the maximal effect a valid
process can produce (cf. rule (Exp Open)); hence the condition $T \le T'$, for
coherence. We define:

  $\mathit{Amb}^{Y}[T_1,T_2] \le \mathit{Amb}^{Y'}[T_1',T_2'] \iff
  \begin{cases} Y \le Y' \\ T_1 \ge T_1' \\ T_2 \le T_2' \end{cases}$

At first sight, it could seem strange to declare a new ambient name with
$T < T'$: if we specify the maximal allowed type T, why would we say that worse
effects T' can appear when opening an ambient of that name? In fact, we need
them to be consistent with the rest of the calculus. Suppose we want to write the
program $\langle n \rangle \mid \langle m \rangle \mid (x : W).P$ where n and m
accept processes of type $T_n$ and $T_m$ respectively. What input type should we
declare for x? We can use the type of the parallel output
$\langle n \rangle \mid \langle m \rangle$, which is, with our ordering,
$\mathit{Amb}^{Y}[T_n \wedge T_m, T_n \vee T_m]$.
In general, there is no reason to have $T_n \wedge T_m = T_n \vee T_m$. This
explains why we cannot replace T and T' by one single process type in an ambient
name.

Note that the conditions $Z \neq \top$ and $O \le I$ are not required for an
ambient name. For example,
$\mathit{Amb}^{Y}[{}^{\top}\top \to \bot, {}^{\top}\top \to \bot]$ is the type
of an ambient name allowing all processes, whereas the ambient name
$\mathit{Amb}^{Y}[{}^{\bot}\bot \to \top, {}^{\bot}\bot \to \top]$ is the most
restrictive one, allowing only processes which do not have inputs or
outputs, i.e. only processes behaving like 0.

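We can only define partial meet and join operations, since there are some
incomparable types (the other cases are undefined):

  $\mathit{Amb}^{Y}[T_1,T_2] \wedge \mathit{Amb}^{Y'}[T_1',T_2'] = \mathit{Amb}^{Y \wedge Y'}[T_1 \vee T_1', T_2 \wedge T_2']$

    if $T_1 \vee T_1' \le T_2 \wedge T_2'$ (or equivalently if $T_1 \le T_2'$ and $T_1' \le T_2$)

  $\mathit{Amb}^{Y}[T_1,T_2] \vee \mathit{Amb}^{Y'}[T_1',T_2'] = \mathit{Amb}^{Y \vee Y'}[T_1 \wedge T_1', T_2 \vee T_2']$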

There is no comparability between ambient names and capabilities. It is
safe to add other useful types here with their usual ordering (for example,
$Int \le Real$), with no comparability with ambient names and capabilities.

The set of capability types has a structure similar to the process types
(i.e. a complete lattice). The set of ambient names has a maximal element
($\mathit{Amb}^{\top^{\circ}}[{}^{\bot}\bot \to \top, {}^{\top}\top \to \bot]$),
but infinitely many minimal elements: all the types
$\mathit{Amb}^{\bot^{\circ}}[T,T]$ for every process type T.
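The contravariant/covariant ordering on ambient names and its partial meet can
be illustrated by the following Python sketch. It rests on a strong simplifying
assumption that is not in the paper: process types are stood in for by
frozensets of effect labels ordered by inclusion (so meet is intersection and
join is union). The annotation names and all function names are hypothetical.

```python
# Locking annotations: four-point lattice  bot <= locked, unlocked <= top.
LOCK_UPSET = {
    "bot": {"bot", "locked", "unlocked", "top"},
    "locked": {"locked", "top"},
    "unlocked": {"unlocked", "top"},
    "top": {"top"},
}

def lock_leq(a, b):
    return b in LOCK_UPSET[a]

def lock_meet(a, b):
    if lock_leq(a, b):
        return a
    if lock_leq(b, a):
        return b
    return "bot"          # locked and unlocked are incomparable

def lock_join(a, b):
    if lock_leq(a, b):
        return b
    if lock_leq(b, a):
        return a
    return "top"

# An ambient-name type is a triple (Y, T1, T2) with T1 <= T2, where the
# stand-in process types T1, T2 are frozensets ordered by inclusion.

def amb_leq(a, b):
    """Y <= Y', T1 >= T1' (contravariant), T2 <= T2' (covariant)."""
    (y, t1, t2), (y2, u1, u2) = a, b
    return lock_leq(y, y2) and u1 <= t1 and t2 <= u2

def amb_join(a, b):
    (y, t1, t2), (y2, u1, u2) = a, b
    return (lock_join(y, y2), t1 & u1, t2 | u2)

def amb_meet(a, b):
    """Partial meet: returns None when T1 v T1' <= T2 ^ T2' fails."""
    (y, t1, t2), (y2, u1, u2) = a, b
    lo, hi = t1 | u1, t2 & u2
    if not lo <= hi:
        return None
    return (lock_meet(y, y2), lo, hi)
```

For two names `n = ("unlocked", frozenset({"a"}), frozenset({"a"}))` and
`m = ("unlocked", frozenset({"b"}), frozenset({"b"}))`, the join has two
different components (the intersection and the union), which mirrors the reason
given above for keeping two process types in an ambient name, while
`amb_meet(n, m)` is undefined.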



3.3 Typing Rules 

Having defined the types and explained their ordering, we can now give the
typing rules of this new type system. A feature of the present approach w.r.t.
CGG is that we avoid introducing arbitrary types in the conclusions of typing
rules. Note, however, that this derivation system is still not deterministic,
because of the subsumption rules and the shape of some rules (for example, in
(Proc Par), the same type T appears in two premises).



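Good Environment ($E \vdash \diamond$)

(Env $\emptyset$)    $\dfrac{}{\emptyset \vdash \diamond}$

(Env n)    $\dfrac{E \vdash \diamond \qquad n \notin \mathit{dom}(E)}{E, n : W \vdash \diamond}$

These two rules are exactly the same as in CGG.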



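Good Expression of Type W ($E \vdash M : W$)

(Exp In)    $\dfrac{E \vdash M : \mathit{Amb}^{Y}[T,T']}{E \vdash \mathit{in}\ M : \mathit{Cap}[{}^{\curvearrowright}\bot \to \top]}$

(Exp Out)    $\dfrac{E \vdash M : \mathit{Amb}^{Y}[T,T']}{E \vdash \mathit{out}\ M : \mathit{Cap}[{}^{\curvearrowright}\bot \to \top]}$

(Exp Open)    $\dfrac{E \vdash M : \mathit{Amb}^{\circ}[T,T']}{E \vdash \mathit{open}\ M : \mathit{Cap}[T']}$

(Exp Imm)    $\dfrac{E \vdash \diamond}{E \vdash \mathit{imm} : \mathit{Cap}[{}^{\veebar}\bot \to \top]}$

(Exp n)    $\dfrac{E', n : W, E'' \vdash \diamond}{E', n : W, E'' \vdash n : W}$

(Exp Sub)    $\dfrac{E \vdash M : W \qquad W \le W'}{E \vdash M : W'}$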
In (Exp In), (Exp Out) and (Exp Imm), the type in the conclusion of the rule
is the minimal effect (or constraint) that the corresponding instruction produces.
In (Exp Open), we use the maximal effect contained in the ambient name and
we check that M is an unlocked ambient. In (Exp Sub), we allow the type of an
expression to be upgraded. The other rules are identical to those of CGG.



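Good Process of Type T ($E \vdash P : T$)

(Proc Par)    $\dfrac{E \vdash P : T \qquad E \vdash Q : T}{E \vdash P \mid Q : T}$

(Proc Action)    $\dfrac{E \vdash M : \mathit{Cap}[T] \qquad E \vdash P : T}{E \vdash M.P : T}$

(Proc Zero)    $\dfrac{E \vdash \diamond}{E \vdash 0 : {}^{\bot}\bot \to \top}$

(Proc Repl)    $\dfrac{E \vdash P : T}{E \vdash\ !P : T}$

(Proc Res)    $\dfrac{E, n : \mathit{Amb}^{Y}[T_n,T_n'] \vdash P : T}{E \vdash (\nu n : \mathit{Amb}^{Y}[T_n,T_n'])P : T}$

(Proc Amb$\circ$)    $\dfrac{E \vdash M : \mathit{Amb}^{\circ}[T,T'] \qquad E \vdash P : T}{E \vdash M[P] : {}^{\bot}\bot \to \top}$

(Proc Amb$\bullet$)    $\dfrac{E \vdash M : \mathit{Amb}^{\bullet}[T,T'] \qquad E \vdash P : T}{E \vdash M|P] : {}^{\bot}\bot \to \top}$

(Proc Input)    $\dfrac{E, n_1 : W_1, \ldots, n_k : W_k \vdash P : {}^{Z}O \to I \qquad I \le W_1 \times \cdots \times W_k}{E \vdash (n_1 : W_1, \ldots, n_k : W_k).P : {}^{Z}O \to I}$

(Proc Output)    $\dfrac{E \vdash M_1 : W_1 \quad \cdots \quad E \vdash M_k : W_k \qquad (E \vdash \diamond \text{ if } k = 0)}{E \vdash \langle M_1, \ldots, M_k \rangle : {}^{\bot}W_1 \times \cdots \times W_k \to \top}$

(Proc Sub)    $\dfrac{E \vdash P : T \qquad T \le T' \qquad T' \text{ valid}}{E \vdash P : T'}$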



(Proc Par), (Proc Action), (Proc Repl) and (Proc Res) are the same rules as
those of CGG (with a syntax modification for (Proc Res)). In (Proc Zero), we
use the minimal process type.

In (Proc Amb$\circ$), we check that M is an unlocked ambient name and that P
has the type of an allowed process inside M (with the subsumption rule, we can
always upgrade it or decrease the type in the ambient name so that they match).
As for 0, we use the minimal process type in the conclusion of the rule. (Proc
Amb$\bullet$) is similar.

In (Proc Input), we just need to check that the input type of the process P
is below the type generated by the input (i.e. is more specific, since the
ordering for input types is contravariant). This is valid: every input type is
accepted in the conclusion provided that it covers the input
$W_1 \times \cdots \times W_k$. If P has a bigger input type ($\top$ for 0, for
example), it must first be upgraded with the subsumption rule before applying
(Proc Input).

In (Proc Output), we just give the minimal effect: the output of a type
$W_1 \times \cdots \times W_k$. Since the output is asynchronous, there is no
condition to check, unlike in (Proc Input).

(Proc Sub) is the classical subsumption rule, with the additional condition
that the new process type must be valid (we are explicitly typing a process here
and not a capability or an ambient name).
3.4 Results

Theorem 2 (Subject reduction). If $E \vdash P : T$ and $P \to Q$, then
$E \vdash Q : T$.

Theorem 3 (Validity). If $E \vdash P : T$, then T is valid.

By this last theorem, we are sure that a well-typed process will never cause 
run-time faults, i.e. there will never be an exchange of incompatible values dur- 
ing execution and it cannot contain instructions requiring both mobility and 
immobility (like in imm.in n.P). Another desired property is that we do not 
want an ambient to be opened if it is locked. This property is a direct result of 
the type system: the instructions open n and n[P] can be typed only if we can 
prove that n is unlocked. 



4 A First Typing Algorithm 

In this Section, we are going to deduce a typing algorithm from the type system 
we introduced in the previous one. Then, we will see that this algorithm returns 
exactly all the types (in a certain sense) that could be derived. 



4.1 Typing Rules 

Definition 4. An environment is said to be well-formed if all the names it
contains are different. This is of course equivalent to $E \vdash \diamond$.
Algorithmically, it just amounts to checking that all the names are different.
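As a minimal sketch of this check, assuming (for illustration only) that an
environment is represented as a list of (name, type) pairs in declaration
order, and using the hypothetical function name `is_well_formed`:

```python
def is_well_formed(env):
    """Return True when no name is declared twice in the environment.

    `env` is assumed to be a list of (name, type) pairs; the types are not
    inspected, only the names are compared.
    """
    names = [name for name, _ in env]
    return len(names) == len(set(names))
```

For example, `is_well_formed([("n", "W1"), ("m", "W2")])` holds, whereas an
environment declaring `n` twice is rejected.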

For any well-formed environment E, we define an algorithm returning the 
type of expressions and processes by the following rules. For every undefined case, 
we will say that the algorithm fails. Note that even if we write it as derivation 
rules for simplicity, it can also be expressed directly in an algorithmic way. Note 
also that the algorithm can be implemented in a parallel way when there are 
several recursive calls (for instance in (Type Par)). 



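Type of an Expression M ($\mathit{Type}(E, M) = W$)

(Type In)    $\dfrac{\mathit{Type}(E, M) = \mathit{Amb}^{Y}[T,T']}{\mathit{Type}(E, \mathit{in}\ M) = \mathit{Cap}[{}^{\curvearrowright}\bot \to \top]}$

(Type Out)    $\dfrac{\mathit{Type}(E, M) = \mathit{Amb}^{Y}[T,T']}{\mathit{Type}(E, \mathit{out}\ M) = \mathit{Cap}[{}^{\curvearrowright}\bot \to \top]}$

(Type Open)    $\dfrac{\mathit{Type}(E, M) = \mathit{Amb}^{Y}[T,T'] \qquad Y \le \circ}{\mathit{Type}(E, \mathit{open}\ M) = \mathit{Cap}[T']}$

(Type Imm)    $\dfrac{}{\mathit{Type}(E, \mathit{imm}) = \mathit{Cap}[{}^{\veebar}\bot \to \top]}$

(Type n)    $\dfrac{}{\mathit{Type}((E', n : W, E''), n) = W}$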



For each message type, we always return the minimal type required by this
capability (for example, $\mathit{Cap}[{}^{\curvearrowright}\bot \to \top]$ for
in M). In (Type Open), we return the maximal effects T' which can appear when
opening an ambient of that name, and we check that this ambient is unlocked by
$Y \le \circ$.
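Type of a Process P ($\mathit{Type}(E, P) = T$)

(Type Par)    $\dfrac{\mathit{Type}(E, P) = T \qquad \mathit{Type}(E, Q) = T' \qquad T \vee T' \text{ valid}}{\mathit{Type}(E, P \mid Q) = T \vee T'}$

(Type Action)    $\dfrac{\mathit{Type}(E, M) = \mathit{Cap}[T] \qquad \mathit{Type}(E, P) = T' \qquad T \vee T' \text{ valid}}{\mathit{Type}(E, M.P) = T \vee T'}$

(Type Zero)    $\dfrac{}{\mathit{Type}(E, 0) = {}^{\bot}\bot \to \top}$

(Type Repl)    $\dfrac{\mathit{Type}(E, P) = T}{\mathit{Type}(E, !P) = T}$

(Type Res)    $\dfrac{\mathit{Type}((E, n : \mathit{Amb}^{Y}[T_n,T_n']), P) = T}{\mathit{Type}(E, (\nu n : \mathit{Amb}^{Y}[T_n,T_n'])P) = T}$

(Type Amb$\circ$)    $\dfrac{\mathit{Type}(E, M) = \mathit{Amb}^{Y}[T,T'] \qquad \mathit{Type}(E, P) = T'' \qquad T'' \le T \qquad Y \le \circ}{\mathit{Type}(E, M[P]) = {}^{\bot}\bot \to \top}$

(Type Amb$\bullet$)    $\dfrac{\mathit{Type}(E, M) = \mathit{Amb}^{Y}[T,T'] \qquad \mathit{Type}(E, P) = T'' \qquad T'' \le T \qquad Y \le \bullet}{\mathit{Type}(E, M|P]) = {}^{\bot}\bot \to \top}$

(Type Input)    $\dfrac{\mathit{Type}((E, n_1 : W_1, \ldots, n_k : W_k), P) = {}^{Z}O \to I \qquad O \le W_1 \times \cdots \times W_k}{\mathit{Type}(E, (n_1 : W_1, \ldots, n_k : W_k).P) = {}^{Z}O \to I \wedge W_1 \times \cdots \times W_k}$

(Type Output)    $\dfrac{\mathit{Type}(E, M_1) = W_1 \quad \cdots \quad \mathit{Type}(E, M_k) = W_k}{\mathit{Type}(E, \langle M_1, \ldots, M_k \rangle) = {}^{\bot}W_1 \times \cdots \times W_k \to \top}$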



In (Type Par), we just take the join of the two sub-process types, ensuring
first that the resulting process is still valid.

In (Type Amb$\circ$), we must check that the type T'' of P is accepted by this
ambient ($T'' \le T$) and that the ambient can be opened ($Y \le \circ$).

In (Type Input), we add the information of the input instruction by returning
the meet of I and $W_1 \times \cdots \times W_k$. We must also check that
$O \le W_1 \times \cdots \times W_k$ in P, to ensure that
$O \le I \wedge W_1 \times \cdots \times W_k$ in the resulting process.

In (Type Output), we just put the information of an output of type
$W_1 \times \cdots \times W_k$. Since there is no continuation, there is nothing
to check here.

The other rules are similar or (quite) natural.
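To make the shape of the algorithm concrete, here is a sketch in Python of the
four rules (Type Zero), (Type Par), (Type Output) and (Type Input) only. It is
an illustration under stated assumptions, not the full algorithm: process types
are triples (Z, O, I), messages are restricted to names looked up in the
environment, the base message types Int and Real with Int below Real are
assumed, environments are plain dictionaries (so name freshness is not
checked), and every identifier is hypothetical.

```python
BOT, TOP = "bot", "top"     # least / greatest input-output types

def msg_leq(a, b):          # assumed base message types, with Int <= Real
    return a == b or (a, b) == ("Int", "Real")

def io_leq(x, y):
    if x == BOT or y == TOP:
        return True
    if x == TOP or y == BOT:
        return False
    return len(x) == len(y) and all(msg_leq(a, b) for a, b in zip(x, y))

def io_bound(x, y, upper):
    """Join of x and y when `upper` is true, meet otherwise."""
    unit, absorbing = (BOT, TOP) if upper else (TOP, BOT)
    if x == unit:
        return y
    if y == unit:
        return x
    if absorbing in (x, y) or len(x) != len(y):
        return absorbing
    parts = []
    for a, b in zip(x, y):
        if msg_leq(a, b):
            parts.append(b if upper else a)
        elif msg_leq(b, a):
            parts.append(a if upper else b)
        else:
            return absorbing
    return tuple(parts)

def z_join(a, b):
    """Join on mobility annotations: bot <= mobile, immobile <= top."""
    if a == b or b == "bot":
        return a
    if a == "bot":
        return b
    return "top"

def join(t, u):
    """Join of process types: covariant in Z and O, contravariant in I."""
    return (z_join(t[0], u[0]),
            io_bound(t[1], u[1], True),
            io_bound(t[2], u[2], False))

def valid(t):
    return t[0] != "top" and io_leq(t[1], t[2])

ZERO = ("bot", BOT, TOP)    # minimal process type

def type_of(env, p):
    """Return the computed type of process p, or raise ValueError on failure."""
    kind = p[0]
    if kind == "zero":                              # (Type Zero)
        return ZERO
    if kind == "par":                               # (Type Par)
        t = join(type_of(env, p[1]), type_of(env, p[2]))
        if not valid(t):
            raise ValueError("parallel composition is not valid")
        return t
    if kind == "output":                            # (Type Output)
        return ("bot", tuple(env[m] for m in p[1]), TOP)
    if kind == "input":                             # (Type Input)
        binders, body = p[1], p[2]
        w = tuple(ty for _, ty in binders)
        z, o, i = type_of({**env, **dict(binders)}, body)
        if not io_leq(o, w):
            raise ValueError("output of the body not covered by the input")
        return (z, o, io_bound(i, w, False))
    raise ValueError("construct not covered by this sketch")
```

For instance, with `env = {"a": "Int"}`, the process
`("par", ("output", ["a"]), ("input", [("x", "Real")], ("zero",)))` is accepted
because the output type Int lies below the input type Real, whereas swapping
the two types makes the validity check in the parallel composition fail.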



4.2 Results 
Theorem 5 (Soundness).

  - If $\mathit{Type}(E, M) = W$, then $E \vdash M : W$.

  - If $\mathit{Type}(E, P) = T$, then $E \vdash P : T$.

Theorem 6 (Completeness).

  - If $E \vdash M : W$, then the algorithm succeeds on M and $\mathit{Type}(E, M) \le W$.

  - If $E \vdash P : T$, then the algorithm succeeds on P and $\mathit{Type}(E, P) \le T$.

From those two theorems, we easily deduce the property of minimal type for
our type system and that the algorithm is able to compute it efficiently.

Corollary 7 (Minimal Type). The set of all possible types for a typable
expression or process has a minimum, and this minimum is precisely the type
returned by the algorithm.
5 Type Inference

In the previous Section, we described a deterministic typing algorithm. This is a 
satisfactory result, to be compared to the nondeterministic type system we had 
before. But, up to now, this algorithm performs only type-checking: the program- 
mer must still annotate explicitly all ambient names and input variables with 
their types. To go one step further, the natural extension would be to remove 
these type annotations and try to design a type-inference algorithm. Unfortu- 
nately, even if at first sight this seems to require only minor modifications, some 
new and difficult problems appear if we want to keep the subtyping relation. So 
we will have to go a little back and restrict our problem to the original type 
system of CGG. For this system, we will show that it can be completely and 
efficiently solved with a Damas-Milner style algorithm. 



5.1 Background 

For the syntax, we will consider the calculus we have studied since the
beginning, that is, with the two new constructs and the associated reduction
rules (they do not bring any new difficulties). For the type system, we take
almost exactly the typing rules of CGG, as they are described in [CGG99]. We
modify them only to handle the two new constructs.

Instead of simply removing the type annotations, we keep them but allow type
variables to be written in their place. For this, we must extend the definitions
of types by adding an infinite set of variables for each of them. More generally,
for the same letter, the lower case one will denote a type variable and the
upper case one will denote a metavariable (as before).

Now we can write expressions like $(x : w).P$ or
$(\nu n : \mathit{Amb}^{y}[t])P$, or even
$(x : \mathit{Cap}[{}^{\curvearrowright}u]).P$. In fact, we allow mixing type
variables and explicit types in the same term or even in the same type
expression. By this means, we get a more generic algorithm, and this property
can be useful in practice: for example, if you want to check an insecure code,
you should be able to constrain some of its types by specifying them explicitly
before applying the type-inference algorithm. Note also that one can express
equality constraints between types just by using the same variable: in
$(x : w).P \mid (y : w).Q$, the input variables x and y must have the same type.

5.2 The Algorithm 

We first need some classical definitions and results: a substitution is a total
map from the set of all type variables (of any kind) to types of the same kind.
We will denote substitutions by the letters $\sigma, \theta, \rho, \ldots$ The
empty substitution is the identity function and is written id. Finally, the
composition of substitutions is defined in exactly the same way as for
functions. We naturally extend substitutions to complex types (and not only type
variables), to processes (by replacing type annotations in input and
restriction constructions) and to environments.
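These definitions can be sketched in Python under an assumed representation
that is not taken from the paper: a type variable is a string, a constructed
type is a tuple whose first component is the constructor name, and a
substitution is a dictionary that is implicitly the identity on every variable
it does not mention. The sketch ignores the distinction between kinds of
variables, and the names `apply` and `compose` are hypothetical.

```python
def apply(sub, ty):
    """Extend a substitution from type variables to arbitrary type trees."""
    if isinstance(ty, str):                 # a type variable
        return sub.get(ty, ty)              # identity outside the domain
    return (ty[0],) + tuple(apply(sub, arg) for arg in ty[1:])

def compose(outer, inner):
    """Return the substitution `outer o inner` (apply inner first)."""
    result = {var: apply(outer, ty) for var, ty in inner.items()}
    for var, ty in outer.items():
        result.setdefault(var, ty)          # variables untouched by inner
    return result

IDENTITY = {}                               # the empty substitution id
```

With `inner = {"t": ("Cap", "u")}` and `outer = {"u": ("Shh",)}`, applying
`compose(outer, inner)` to `("Amb", "y", "t")` gives the same result as
applying `inner` and then `outer`.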
A unifier of two types $X_1$ and $X_2$ is a substitution $\sigma$ such that
$\sigma(X_1) = \sigma(X_2)$. Since the types in our system are simple trees, we
know that there is a sound and complete unification algorithm for those types.
It returns the principal unifier of two types (when it exists; otherwise it
fails). We will call it $mgu(\cdot, \cdot)$ (it is not difficult to write its
rules explicitly).
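One standard way to write such a unification procedure for tree-shaped types is
sketched below in Python. The representation is an assumption made for the
example (type variables are strings, constructed types are tuples headed by a
constructor name, substitutions are dictionaries), kinds of variables are
ignored, and the function names are hypothetical; failure is signalled by a
`ValueError`.

```python
def apply(sub, ty):
    """Apply a substitution (a dict) to a type tree."""
    if isinstance(ty, str):
        return sub.get(ty, ty)
    return (ty[0],) + tuple(apply(sub, arg) for arg in ty[1:])

def occurs(var, ty):
    """Does the variable occur anywhere inside the type tree?"""
    if isinstance(ty, str):
        return var == ty
    return any(occurs(var, arg) for arg in ty[1:])

def bind(var, ty, sub):
    """Extend `sub` with var := ty, keeping earlier bindings up to date."""
    if occurs(var, ty):
        raise ValueError("occurs check failed")
    new = {v: apply({var: ty}, t) for v, t in sub.items()}
    new[var] = ty
    return new

def mgu(a, b, sub=None):
    """Return a most general unifier of a and b, or raise ValueError."""
    sub = dict(sub or {})
    a, b = apply(sub, a), apply(sub, b)
    if a == b:
        return sub
    if isinstance(a, str):
        return bind(a, b, sub)
    if isinstance(b, str):
        return bind(b, a, sub)
    if a[0] != b[0] or len(a) != len(b):
        raise ValueError("constructor clash")
    for x, y in zip(a[1:], b[1:]):
        sub = mgu(x, y, sub)                # thread the substitution through
    return sub
```

For example, unifying `("Amb", "y", "t")` with
`("Amb", ("unlocked",), ("Cap", "u"))` binds `y` and `t`, while unifying a
`Cap` type with an `Amb` type fails.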

We can now give the rules of the typing algorithm. They are used to infer
judgments of the forms $\mathit{Infer}(E, M) = (W, \sigma)$ and
$\mathit{Infer}(E, P) = (T, \sigma)$, where W or T is the most generic possible
type for M or P (possibly containing type variables), $\sigma$ is a substitution
representing the constraints on the type variables in M or P, and E is a
well-formed environment.

In the following rules, the premises must be read (and applied) from left 
to right. We do not detail how the algorithm gets new type variables. We will 
only consider that whenever a variable is declared new, it is different from all 
type variables previously used. In practice this can be achieved by using a global 
counter to number new type variables. 
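A global counter of this kind can be sketched in a few lines of Python. The
naming scheme (a prefix followed by a number) and the function name `new_var`
are assumptions made for the example.

```python
import itertools

_counter = itertools.count()        # shared by all calls, never reset

def new_var(prefix="t"):
    """Return a type-variable name that was never returned before."""
    return f"{prefix}{next(_counter)}"
```

Because the counter is shared across prefixes, two calls never return the same
name, provided the annotations written by the programmer do not use names of
this form.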



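Type-Inference for an Expression M ($\mathit{Infer}(E, M)$)

(Infer In)    $\dfrac{\mathit{Infer}(E, M) = (W, \sigma) \qquad y, t \text{ new} \qquad mgu(W, \mathit{Amb}^{y}[t]) = \rho \qquad u \text{ new}}{\mathit{Infer}(E, \mathit{in}\ M) = (\mathit{Cap}[{}^{\curvearrowright}u], \rho\sigma)}$

(Infer Out)    $\dfrac{\mathit{Infer}(E, M) = (W, \sigma) \qquad y, t \text{ new} \qquad mgu(W, \mathit{Amb}^{y}[t]) = \rho \qquad u \text{ new}}{\mathit{Infer}(E, \mathit{out}\ M) = (\mathit{Cap}[{}^{\curvearrowright}u], \rho\sigma)}$

(Infer Open)    $\dfrac{\mathit{Infer}(E, M) = (W, \sigma) \qquad t \text{ new} \qquad mgu(W, \mathit{Amb}^{\circ}[t]) = \rho}{\mathit{Infer}(E, \mathit{open}\ M) = (\mathit{Cap}[\rho(t)], \rho\sigma)}$

(Infer Imm)    $\dfrac{u \text{ new}}{\mathit{Infer}(E, \mathit{imm}) = (\mathit{Cap}[{}^{\veebar}u], id)}$

(Infer n)    $\dfrac{}{\mathit{Infer}((E, n : W, E'), n) = (W, id)}$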



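Type-Inference for a Process P ($\mathit{Infer}(E, P)$)

(Infer Par)    $\dfrac{\mathit{Infer}(E, P) = (T, \sigma) \qquad \mathit{Infer}(\sigma(E), \sigma(Q)) = (T', \sigma') \qquad mgu(\sigma'(T), T') = \rho}{\mathit{Infer}(E, P \mid Q) = (\rho(T'), \rho\sigma'\sigma)}$

(Infer Zero)    $\dfrac{t \text{ new}}{\mathit{Infer}(E, 0) = (t, id)}$

(Infer Repl)    $\dfrac{\mathit{Infer}(E, P) = (T, \sigma)}{\mathit{Infer}(E, !P) = (T, \sigma)}$

(Infer Action)    $\dfrac{\mathit{Infer}(E, M) = (W, \sigma) \qquad \mathit{Infer}(\sigma(E), \sigma(P)) = (T, \sigma') \qquad mgu(\sigma'(W), \mathit{Cap}[T]) = \rho}{\mathit{Infer}(E, M.P) = (\rho(T), \rho\sigma'\sigma)}$

(Infer Res)    $\dfrac{\mathit{Infer}((E, n : \mathit{Amb}^{Y}[T]), P) = (T', \sigma)}{\mathit{Infer}(E, (\nu n : \mathit{Amb}^{Y}[T])P) = (T', \sigma)}$

(Infer Amb$\circ$)    $\dfrac{\mathit{Infer}(E, M) = (W, \sigma) \qquad \mathit{Infer}(\sigma(E), \sigma(P)) = (T, \sigma') \qquad mgu(\sigma'(W), \mathit{Amb}^{\circ}[T]) = \rho \qquad t \text{ new}}{\mathit{Infer}(E, M[P]) = (t, \rho\sigma'\sigma)}$

(Infer Amb$\bullet$)    $\dfrac{\mathit{Infer}(E, M) = (W, \sigma) \qquad \mathit{Infer}(\sigma(E), \sigma(P)) = (T, \sigma') \qquad mgu(\sigma'(W), \mathit{Amb}^{\bullet}[T]) = \rho \qquad t \text{ new}}{\mathit{Infer}(E, M|P]) = (t, \rho\sigma'\sigma)}$

(Infer Input)    $\dfrac{\mathit{Infer}((E, n_1 : W_1, \ldots, n_k : W_k), P) = (T, \sigma) \qquad z \text{ new} \qquad mgu(T, {}^{z}\sigma(W_1) \times \cdots \times \sigma(W_k)) = \rho}{\mathit{Infer}(E, (n_1 : W_1, \ldots, n_k : W_k).P) = (\rho(T), \rho\sigma)}$

(Infer Output)    $\dfrac{\mathit{Infer}(E, M_1) = (W_1, \sigma_1) \qquad \mathit{Infer}(\sigma_1(E), M_2) = (W_2, \sigma_2) \quad \cdots \quad \mathit{Infer}(\sigma_{k-1} \cdots \sigma_1(E), M_k) = (W_k, \sigma_k) \qquad z \text{ new}}{\mathit{Infer}(E, \langle M_1, \ldots, M_k \rangle) = ({}^{z}\sigma_k \cdots \sigma_2(W_1) \times \cdots \times \sigma_k(W_{k-1}) \times W_k,\ \sigma_k \cdots \sigma_1)}$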



5.3 Results 

Theorem 8 (Soundness). If $\mathit{Infer}(E, P) = (T, \sigma)$, then
$\sigma(E) \vdash_{CGG} \sigma(P) : T$. Moreover,
$\sigma'\sigma(E) \vdash_{CGG} \sigma'\sigma(P) : \sigma'(T)$ for any
substitution $\sigma'$ (we will say that all these derivations are solutions
returned by the inference algorithm).

Theorem 9 (Completeness). If there is a type T such that
$\sigma(E) \vdash_{CGG} \sigma(P) : T$ (i.e. if the process P is typable in the
environment E after performing some substitutions on type variables), the
inference algorithm $\mathit{Infer}(E, P)$ succeeds and
$\sigma(E) \vdash_{CGG} \sigma(P) : T$ is one of the returned solutions.

5.4 Type Inference with Subtyping 

Returning to the original problem, can we do the same as above with the
type system with subtyping? Adding subtyping brings many problems, mainly
because there is no minimal type for ambient names and because we get ordering
constraints due to ambient names and valid processes. Similar problems
appeared in the type system of Abadi and Cardelli for the object calculus. In
that case, Jens Palsberg gave a solution in [Pal95], by building a graph of
constraints and checking some properties on it. Maybe the same approach would be
possible with the ambient calculus, but our attempts in this direction failed.
Up to now, all we could do is build a set of constraints that the type variables
should satisfy in order to get a solution. But solving it remains an open
problem (see [Zim99] for more details and explanations).

6 Conclusion 

We have extended the previous type system for mobile ambients with new types 
and with a subtyping relation. We gave the corresponding typing rules and de- 
duced a type-checking algorithm. We also gave a type-inference algorithm for 
CGG, but the problem of solving the constraints set in the system with subtyp- 
ing remains open. 
These algorithms are efficient and could be implemented quite easily. To our
knowledge, there are two "implementations" of ambients so far: a Java applet
from L. Cardelli and a translation into the join-calculus in a modified version of
Objective Caml ([FS99]). Neither of them uses types for now.

Another primitive was introduced by Cardelli-Ghelli-Gordon in [CGG99]:
the primitive go, which performs objective moves. To prevent some dangerous
effects such moves can induce (entrapping of an ambient), they extended the
type system so that the type of an ambient name says explicitly whether the
ambient allows them or not. We did not keep this primitive, in order to simplify
the notations for ambient names, but we checked that all our work and algorithms
could easily be extended so as to include go.

In [LS00], Levi and Sangiorgi studied plain and grave interferences in the
ambient calculus. They proposed a syntax extension along with a new type
system to prevent grave interferences. Future work may be to extend our
subtyping relation and algorithms to their system.